March 1956 


BIOMETRICS


Vol. 12 No. 1
JOURNAL OF THE BIOMETRIC SOCIETY 


Fractional Replication for Mixed Series Milton Morrison 


Block Effects in the Determination of 
Optimum Conditions Robert M. DeBaun 


Adjustment by Covariance and Consequent Tests of 
Significance in Split-Plot Experiments Jeanne Titus Truett 
and H. Fairfield Smith 


A Note on the Combination of Estimates of Relative
Potency in Multiple Assays Pamela M. Clarke


Missing and “Mixed-Up” Frequencies in Contingency
Tables G. S. Watson


Contributions to Simultaneous Confidence 
Interval Estimation K. V. Ramachandran


Random Genetic Drift in a Tri-Allelic Locus; Exact Solution
With a Continuous Model Motoo Kimura 


Multivariate Analysis and Agricultural Experiments D. J. Finney 


The Relation Between Quantal and Graded Responses to 
Drugs P. S. Hewlett and R. L. Plackett 


One Likelihood Adjustment May be Inadequate H.W. Norton 




The Biometric Society 


BIOMETRICS


FOUNDED BY THE BIOMETRICS SECTION OF THE AMERICAN STATISTICAL ASSOCIATION


TABLE OF CONTENTS 


Fractional Replication for Mixed Series
    Milton Morrison

Block Effects in the Determination of Optimum Conditions
    Robert M. DeBaun

Adjustment by Covariance and Consequent Tests of Significance
in Split-Plot Experiments
    Jeanne Titus Truett and H. Fairfield Smith

A Note on the Combination of Estimates of Relative Potency in
Multiple Assays
    Pamela M. Clarke

Missing and “Mixed-Up” Frequencies in Contingency Tables
    G. S. Watson

Contributions to Simultaneous Confidence Interval Estimation
    K. V. Ramachandran

Random Genetic Drift in a Tri-Allelic Locus; Exact Solution
With a Continuous Model
    Motoo Kimura

Multivariate Analysis and Agricultural Experiments
    D. J. Finney

The Relation Between Quantal and Graded Responses to Drugs
    P. S. Hewlett and R. L. Plackett

One Likelihood Estimate May Be Inadequate
    H. W. Norton

Queries

Abstracts

The Biometric Society

The Varenna Seminar in Biometry
    L. L. Cavalli-Sforza

News and Notes


Number 1 March 1956 Volume 12



Material for Biometrics should be addressed to Dr. J. W. Hopkins, National Research 
Council, Ottawa 2, Canada, except that authors residing in one of the following 
organized regions can expedite the handling of their papers by submitting them to the 
Assistant Editor for that region. 


British Region: Dr. D. J. Finney, Dept. of Stat., Univ. of Aberdeen, Aberdeen, Scot- 
land; Australasian Region: Dr. E. A. Cornish, University of Adelaide, Adelaide, 
Australia; French Region: Dr. Georges Teissier, Faculte des Sciences de Paris, 1 rue 
V. Cousin, Paris, France.


Material for Queries should go to Professor G. W. Snedecor, Statistical Laboratory, 
Iowa State College, Ames, Iowa. 


Articles to be considered for publication should be submitted in triplicate. 


THE BIOMETRIC SOCIETY 
General Officers 


President, E. A. Cornish; Secretary, M. J. R. Healy; Treasurer, C. I. Bliss; Council,
F. J. Anscombe, C. Barigozzi, W. G. Cochran, G. M. Cox, Georges Darmois, B. B.
Day, D. J. Finney, J. H. Gaddum, M. P. Geppert, A. Groszmann, P. C. Mahalanobis,
Donald Mainland, Leopold Martin, M. Masuyama, P. A. Moran, J. Neyman, C. R. 
Rao, J. W. Tukey, E. J. Williams, Frank Yates. 


Regional Officers 


Eastern North American Region: Regional President, D. B. Duncan; Secretary- 
Treasurer, A. M. Dutton. British Region: Regional President, D. J. Finney; Secretary, 
E. C. Fieller; Treasurer, A. R. G. Owen. Western North American Region: Regional
President, W. J. Dixon; Secretary-Treasurer, Elizabeth Vaughan. Australasian Region:
Regional President, E. J. Williams; Secretary, G. S. Watson; Treasurer, G. A.
McIntyre. French Region: Regional President, Georges Darmois; Secretary-Treasurer, 
Daniel Schwartz. Belgian Region: Regional President, Paul Spehl; Secretary, Leopold 
Martin; Treasurer, Claude Panier. Italian Region: Regional President, G. Barbensi; 
Secretary, L. L. Cavalli-Sforza; Treasurer, R. Scossiroli. German Region: Regional 
President, E. Ullrich; Secretary-Treasurer, W. Ludwig. 


National Secretaries 


Denmark, N. F. Gjeddebaek; The Netherlands, E. van der Laan; India, V. G. 
Panse; Japan, M. Hatamura; Switzerland, Arthur Linder; Sweden, H. O. A. Wold;
Brazil, Americo Groszmann. 


Editorial Board 
Biometrics 
Editor: J. W. Hopkins; Editorial Associates and Committee Members: C. I. Bliss, 
Irwin Bross, E. A. Cornish, W. J. Dixon, Mary Elveback, Ralph Bradley, D. J. 
Finney, S. Lee Crump, Leopold Martin, Horace W. Norton, O. Kempthorne, G. W.
Snedecor and Georges Teissier. Managing Editor: J. W. Hopkins. 


The Biometric Society is an international society devoted to the mathematical and statistical 
aspects of biology. Biologists, mathematicians, statisticians and others interested in its objectives 
are invited to become members. Through its regional organizations the Society sponsors regional
and local meetings. National secretaries serve the interests of members in Brazil, Denmark, India,
Japan, The Netherlands and Sweden, and there are many members at large. Rates (in U.S.A. 
currency) for full membership in the Society for 1955, including dues for a subscription to 
BIOMETRICS, are the following: for residents of Canada and the United States $7.00, for 
members resident in other parts of the world $4.50. Members of the American Statistical Associa- 
tion who are currently subscribing to BIOMETRICS through that organization may become mem- 
bers of The Biometric Society on the payment of $3.00 annual dues if resident in the United 
States or Canada, and of $1.75 annual dues if resident elsewhere. Information concerning the 
Society can be obtained from the Secretary-Treasurer, The Biometric Society, Box 1106, New 
Haven 4, Connecticut, U.S.A.

The annual subscription to BIOMETRICS for non-members of The Biometric Society is 
$7.00, payable to the Managing Editor, BIOMETRICS, National Research Council, Ottawa 2, 
Canada. Members of the American Statistical Association may subscribe to BIOMETRICS 
through the Secretary of the American Statistical Association at $4.00 per annum. 


Second-class mail privileges authorized at New Haven, Conn. Additional entry 
at Richmond, Va. Business Office, 52 Hillhouse Ave., New Haven, Conn. Biometrics 
is published quarterly—in March, June, September and December. 




FRACTIONAL REPLICATION FOR MIXED SERIES 


Milton Morrison*
Experimental Towing Tank, Stevens Institute of Technology 


1. Introduction 


The type of experimental designs to be described in this paper 
was necessitated by a situation which often occurs in engineering 
research. In such situations, a considerable number of independent 
variables are involved, and project engineers are accustomed to pre- 
senting their results in the form of large numbers of families of curves 
obtained by keeping all independent variables constant except one. 
Since each independent variable (factor) is assigned several values 
(levels) the number of test points to be obtained becomes large, some- 
times forbiddingly so. Virtually all the test saving experimental 
designs which appear in statistical literature require either that the 
factors involved are at a convenient number of levels (for example, 
much has been written on designs in which all factors are at the same 
number of levels), or that some or all of the two factor interactions are 
zero. These conditions rarely obtain in engineering experimentation 
and hence the standard designs cannot be used except in special 
situations. 

However, the method known as fractional replication gives promise 
of alleviating this situation. Using fractional replication, it is possible, 
under certain conditions, to test only a portion of all the possible combi- 
nations of levels of factors; and yet be able to test hypotheses on the 
existence of all main effects and all two factor interactions. What is
more important from the point of view of the test engineer is that it is
also possible to estimate all main effects and two factor interactions and
hence estimate that portion of the observations which have not been
made under the plan, so that if he chooses to present results graphically,
all points obtained, either by actual tests or by estimation, can be
plotted.


*This work was sponsored by the Office of Naval Research.
A portion of this paper was presented at the New York University meeting of the Section on
Physical and Engineering Sciences of the American Statistical Association, May 1959.

Unfortunately, however, the method of fractional replication is 
strongest when all factors are at two levels, and such situations are not 
too frequently encountered. However, in the next section, a method 
will be described whereby fractional replication can be used to ad- 
vantage even when all factors are not at the same number of levels, 
under conditions which often exist in experimentation performed for
engineering research programs, and indeed in other fields also. These
conditions are:


a. When the primary interest is in reducing the number of observa- 
tions required in a complete replication. 

b. When three-factor and higher order interactions can be con- 
sidered to equal zero. 

c. When the observations are independent and their distributions 
have equal variances. 

d. When there are at least five factors. The development which 
follows will be mostly in connection with the five factor situation 
but the method is applicable virtually without alteration when 
there are more than five factors. The method could be used with 
fewer than five factors, but the considerable confounding of main 
effects and interactions in such situations would weaken its 
effectiveness. 


2. The Procedure 


The method requires that the total number of possible data points
be expressed in the form $z_n(2^n) + z_{n-1}(2^{n-1}) + \cdots + z_1(2) + z_0$, where
$z_0$ equals 0 or 1 according to whether there is an even or an odd number
of possible observations. For all groups in which the exponent is equal
to or greater than five, a straightforward fractional replication is
possible which does not demand that any two factor interaction be
assumed to equal zero. By means to be described, half the observations
associated with the remaining terms (except $z_0$, if $z_0 = 1$) can be
dispensed with. The procedure will be illustrated by an example.
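
The counting in this step can be sketched in a few lines of Python. This is only an illustration: it assumes the plain binary expansion of the total (each $z_i$ taken as 0 or 1), and the function names are hypothetical. Grouping the terms by factor, as the paper does later, gives the same number of required observations.

```python
def power_of_two_terms(n_points):
    """Exponents k for which 2**k appears in the binary expansion of n_points."""
    return [k for k in range(n_points.bit_length() - 1, -1, -1)
            if (n_points >> k) & 1]


def observations_required(n_points):
    """Half of every term 2**k with k >= 1; the unit term z_0, if present, is kept whole."""
    return sum(2 ** (k - 1) if k >= 1 else 1
               for k in power_of_two_terms(n_points))


# 48 = 2**5 + 2**4, the four-factors-at-2, one-factor-at-3 case
terms_48 = power_of_two_terms(48)
needed_48 = observations_required(48)
print(terms_48, needed_48)
```

For 48 possible points the expansion has the two terms $2^5$ and $2^4$, so half of each is observed.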

Assume there are five independent variables, four at 2 levels and
one at 3 levels. If all possible combinations are tested, there would be
(2)(2)(2)(2)(3) = 48 observations. Write: 48 = (2)(2)(2)(2)(2 + 1)
and then to preserve the association between each number and the
factors and levels it represents, write:

$$2(a_1a_2) \cdot 2(b_1b_2) \cdot 2(c_1c_2) \cdot 2(d_1d_2) \cdot [2(e_1e_2) + 1(e_3)] \qquad \text{(i)}$$

Carrying out the formal multiplication in (i) we obtain

$$2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2) + 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3) \qquad \text{(ii)}$$

The total number of observations, 48, can be expressed as $48 = 1 \cdot 2^5 + 1 \cdot 2^4$.
Thus, in this example, $z_5 = z_4 = 1$; $z_3 = z_2 = z_1 = z_0 = 0$.

The first term in (ii), $2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2)$, corresponds
to the $2^5$ observations which can be obtained by taking all possible
combinations of A, B, C, and D and the first two levels of E. Then
for this portion of the design, a half-replication which permits estimation
of all main effects, A, B, C, D, E, and two-factor interactions AB, AC,
AD, AE, BC, BD, BE, CD, CE, DE (higher order interactions assumed
to be zero) would be specified by the first 16 rows of (iii). The
observations thus specified were selected by making the ABCDE interaction
the defining contrast, as described in Reference [1].

Next consider the second term in (ii), $2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3)$.
This term corresponds to the $2^4$ observations which can be obtained
by taking all possible combinations of the levels $a_1a_2$, $b_1b_2$, $c_1c_2$, $d_1d_2$ at
$e = e_3$. To choose half these observations, suppose first we had the
problem of testing or estimating all observations which could be made
for all combinations of A, B, C, and D and the second and third levels
of E; which, of course, includes the second term in (ii) as a one-half
subset. Then a one-half replication would be the same as that performed
for the first term in (ii) except that $e_1$ would be replaced by $e_3$. However,
$e_1$ appears in only 8 of the 16 observations in (iii). Hence, it would be
necessary to make only 8 additional observations in order to obtain by
observation and estimation all 16 combinations represented by the
second term in (ii).

Thus, from half the observations corresponding to the first and 
second terms of (ii), the other half could be estimated. The method is 
easily extended to situations where the factors are at higher levels than 
for the experimental situation discussed above, or for situations where 
there are more than five factors. The designs obtained under the above 
described procedure will be called “estimation” designs. 
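
The construction just described can be sketched in Python for the $2^4 \times 3$ case. The sketch rests on one assumption not stated in words in the text: the half-replicate is taken to be the set of treatment combinations with an odd number of factors at level 2, which is the set listed for the two-level designs in this paper. Function names are illustrative only.

```python
from itertools import product


def half_replicate(n_factors=5):
    """Runs (levels coded 1 or 2) with an odd number of factors at level 2."""
    return [run for run in product((1, 2), repeat=n_factors)
            if sum(level == 2 for level in run) % 2 == 1]


def estimation_design_24x3():
    """16 runs for levels e1, e2 plus 8 runs obtained by replacing e1 with e3."""
    base = half_replicate(5)
    extra = [run[:4] + (3,) for run in base if run[4] == 1]
    return base + extra


design = estimation_design_24x3()
codes = ["".join(map(str, run)) for run in design]
print(len(codes))
print(codes[:4])
```

Of the 48 possible combinations, 24 are observed under this scheme; the other 24 are left to be estimated.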

The design has been formed using a method of estimation as the
motivating element. This method of estimation has much to recommend
it, though it differs from the traditional procedure. Once the design is
made, however, one need not retain the method of estimation but can
instead form least squares estimates.

One-half the observations associated with $2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3)$
have been chosen by the means described above and so under
this design, the experimental conditions at which observations are to
be taken are:


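     A  B  C  D  E
     2  1  1  1  1
     1  2  1  1  1
     1  1  2  1  1
     1  1  1  2  1
     1  1  1  1  2
     2  2  2  1  1
     2  2  1  2  1
     2  2  1  1  2
     2  1  2  2  1
     2  1  2  1  2
     2  1  1  2  2
     1  2  2  1  2          (iii)
     1  2  2  2  1
     1  2  1  2  2
     1  1  2  2  2
     2  2  2  2  2
     2  1  1  1  3
     1  2  1  1  3
     1  1  2  1  3
     1  1  1  2  3
     2  2  2  1  3
     2  2  1  2  3
     2  1  2  2  3
     1  2  2  2  3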


This, then, describes the manner in which the design is made. Had
there been a $2^3$ term, four more observations would be dictated by
the procedure, and similarly for $2^2$ and 2. If all factors are at an odd
number of levels, an even half-replication will, of course, be impossible.
To illustrate further the manner of setting up the designs, we include
the designs for the $2^4 \times 4$ and $2^3 \times 3 \times 5$ half replicates.

For the $2^4 \times 4$ experiment we write:

$$[2(a_1a_2)][2(b_1b_2)][2(c_1c_2)][2(d_1d_2)][2(e_1e_2) + 2(e_3e_4)]$$

so that the observations to be taken are simply half-replicates of both
terms as shown in (iv).


$2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2)$    $2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3e_4)$


21111 21113 
12111 12113 
11211 11213 
11121 11123 
11112 11114 
22211 22213 
22121 22123 
22112 22114 (iv) 
21221 21223 
21212 21214 
21122 21124 
12212 12214 
12221 12223 
12122 12124 
11222 11224 
22222 22224 


Note the symmetry of the design and the ease with which it is 
formed. This is characteristic of experimental situations in which all 
factors are at two levels. 


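For the $2^3 \times 3 \times 5$ experiment, we write:

$$\begin{aligned}
&[2(a_1a_2)][2(b_1b_2)][2(c_1c_2)][2(d_1d_2) + d_3][2(e_1e_2) + 2(e_3e_4) + e_5] \\
&\quad = 2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2) + 2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3e_4) \\
&\qquad + 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_3)(e_1e_2) + 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_3)(e_3e_4) \\
&\qquad + 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_5) + 2^3(a_1a_2)(b_1b_2)(c_1c_2)(d_3)(e_5)
\end{aligned}$$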


The design for the half-replication is given by Table I. In this table
subscripts obtained by the replacement process are in the same row.

Columns (1) and (2) in Table I are ordinary half-replications;
column (3) is obtained by assuming that it is required to find by
observation or estimation all observations associated with
$2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_2d_3)(e_1e_2)$. To do this one would need a
half-replication consisting of sixteen observations, eight of which have
already been made for column (1); the remaining eight are obtained from
column (1) by using all observations in this column which have 1 in the
fourth place. Change these 1's to 3's to give column (3). A similar
procedure gives the remaining columns.
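
The replacement process for the six columns can be sketched as follows. This is an illustrative reading of the description above, not the author's own procedure: the helper `swap` is a hypothetical name, the base half-replicate is again assumed to be the runs with an odd number of factors at level 2, and positions are counted from zero (so the fourth place is index 3).

```python
from itertools import product


def half_replicate():
    """Half-replicate of the 2^5: runs with an odd number of factors at level 2."""
    return [r for r in product((1, 2), repeat=5)
            if sum(x == 2 for x in r) % 2 == 1]


def swap(runs, position, old, new):
    """Keep the runs having `old` at `position` and change that level to `new`."""
    return [r[:position] + (new,) + r[position + 1:]
            for r in runs if r[position] == old]


col1 = half_replicate()                          # d1 d2, e1 e2
col2 = swap(col1, 4, 1, 3) + swap(col1, 4, 2, 4)  # d1 d2, e3 e4
col3 = swap(col1, 3, 1, 3)                       # d3,    e1 e2
col4 = swap(col2, 3, 1, 3)                       # d3,    e3 e4
col5 = swap(col1, 4, 1, 5)                       # d1 d2, e5
col6 = swap(col3, 4, 1, 5)                       # d3,    e5
design = col1 + col2 + col3 + col4 + col5 + col6
print(len(design))
```

The six columns contain 16, 16, 8, 8, 8 and 4 runs, which is half of the 120 possible combinations.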


TABLE I
Design for Half-Replication of $2^3 \times 3 \times 5$ Experiment

   (1)        (2)        (3)        (4)        (5)        (6)
   2^5        2^5        2^4        2^4        2^4        2^3
 (a1a2)     (a1a2)     (a1a2)     (a1a2)     (a1a2)     (a1a2)
 (b1b2)     (b1b2)     (b1b2)     (b1b2)     (b1b2)     (b1b2)
 (c1c2)     (c1c2)     (c1c2)     (c1c2)     (c1c2)     (c1c2)
 (d1d2)     (d1d2)       d3         d3       (d1d2)       d3
 (e1e2)     (e3e4)     (e1e2)     (e3e4)       e5         e5

 21111      21113      21131      21133      21115      21135
 12111      12113      12131      12133      12115      12135
 11211      11213      11231      11233      11215      11235
 11121      11123                            11125
 11112      11114      11132      11134
 22211      22213      22231      22233      22215      22235
 22121      22123                            22125
 22112      22114      22132      22134
 21221      21223                            21225
 21212      21214      21232      21234
 21122      21124
 12212      12214      12232      12234
 12221      12223                            12225
 12122      12124
 11222      11224
 22222      22224


3. Least Squares Estimates of the Effects 


Least squares estimates of the effects will be found for the $2^4 \times 4$
experiment and in the development comments will be made bearing 
on the results which would be obtained in five variable experiments in 
which the factors are at other levels. Some extensions, as for example 
experiments in which all factors are at an even number of levels, will 
be obvious. 

In the analysis, a function of many variables subject to many 
restrictions must be differentiated. The side restrictions will not be 
used until the differentiation has been carried out. If the solutions 
thus obtained satisfy the restrictions, they are the required solutions. 

In what follows, it should be understood that wherever a summation
of observations is given over indicated subscripts, it means that
the subscripts assume the combinations of values which correspond to
observations in the design; it does not mean that the summation is over
all values of the subscripts, for this would be the case for a full
replication. The subscript 1/2 on the summation variables is a reminder of
this. It also follows that the ensuing discussion does not hold for
experiments in which all the factors are at an odd number of levels, for
in that case a 1/2 replication is impossible. It is still possible to set up
a design for such cases. However, finding the least squares estimates
would be a formidable task, and the analysis of variance for the
experiment would almost certainly not be elegant.
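Let $x_{ijklm}$ represent an observation, where

$$i = 1, 2, \ldots, I; \quad j = 1, 2, \ldots, J; \quad k = 1, 2, \ldots, K; \quad l = 1, 2, \ldots, L; \quad m = 1, 2, \ldots, M.$$

Let

$$\begin{aligned}
Q = \sum_{(ijklm)_{1/2}} [\, x_{ijklm} &- \mu - \alpha_i - \beta_j - \gamma_k - \delta_l - \epsilon_m \\
&- (\alpha\beta)_{ij} - (\alpha\gamma)_{ik} - (\alpha\delta)_{il} - (\alpha\epsilon)_{im} \\
&- (\beta\gamma)_{jk} - (\beta\delta)_{jl} - (\beta\epsilon)_{jm} \\
&- (\gamma\delta)_{kl} - (\gamma\epsilon)_{km} - (\delta\epsilon)_{lm} \,]^2
\end{aligned}$$

subject to the restrictions:

$$\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = \sum_l \delta_l = \sum_m \epsilon_m = 0,$$

$$\sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0, \quad \text{and similarly for all other interactions.}$$

In what follows, carats will be placed over the parameters, to indicate
that the solutions of the equations are estimators.
Setting $\partial Q/\partial\mu = 0$, we obtain

$$\begin{aligned}
\sum_{(ijklm)_{1/2}} [\, x_{ijklm} &- \hat\mu - \hat\alpha_i - \hat\beta_j - \hat\gamma_k - \hat\delta_l - \hat\epsilon_m \\
&- (\widehat{\alpha\beta})_{ij} - (\widehat{\alpha\gamma})_{ik} - (\widehat{\alpha\delta})_{il} - (\widehat{\alpha\epsilon})_{im} \\
&- (\widehat{\beta\gamma})_{jk} - (\widehat{\beta\delta})_{jl} - (\widehat{\beta\epsilon})_{jm} - (\widehat{\gamma\delta})_{kl} - (\widehat{\gamma\epsilon})_{km} - (\widehat{\delta\epsilon})_{lm} \,] = 0
\end{aligned}$$

Summing term by term,

$$\sum_{(ijklm)_{1/2}} \hat\alpha_i = \tfrac{1}{2} JKLM \sum_i \hat\alpha_i = 0;$$

similarly for $\hat\beta_j$, etc.

$$\sum_{(ijklm)_{1/2}} (\widehat{\alpha\beta})_{ij} = \tfrac{1}{2} KLM \sum_{i,j} (\widehat{\alpha\beta})_{ij} = 0;$$

similarly for all other interactions. This gives

$$\hat\mu = \frac{\sum_{(ijklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IJKLM}$$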

Note, in the above, that the step $\sum_{(ijklm)_{1/2}} \hat\alpha_i = 0$ depended on
$\alpha_1$ and $\alpha_2$ being involved in the same number of observations. This
could not be the case for a factor at two levels in an experiment in which
one factor is at two levels, and all the others at an odd number of levels,
for in such a case, a one-half replication would have an odd number of
observations.

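Setting $\partial Q/\partial\alpha_i = 0$, we obtain

$$\begin{aligned}
\sum_{(jklm)_{1/2}} [\, x_{ijklm} &- \hat\mu - \hat\alpha_i - \hat\beta_j - \hat\gamma_k - \hat\delta_l - \hat\epsilon_m \\
&- (\widehat{\alpha\beta})_{ij} - (\widehat{\alpha\gamma})_{ik} - (\widehat{\alpha\delta})_{il} - (\widehat{\alpha\epsilon})_{im} \\
&- (\widehat{\beta\gamma})_{jk} - (\widehat{\beta\delta})_{jl} - (\widehat{\beta\epsilon})_{jm} - (\widehat{\gamma\delta})_{kl} - (\widehat{\gamma\epsilon})_{km} - (\widehat{\delta\epsilon})_{lm} \,] = 0
\end{aligned}$$

which leads to

$$\sum_{(jklm)_{1/2}} x_{ijklm} - \tfrac{1}{2} JKLM \hat\mu - \tfrac{1}{2} JKLM \hat\alpha_i = 0$$

From this,

$$\hat\alpha_i = \frac{\sum_{(jklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} JKLM} - \frac{\sum_{(ijklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IJKLM}$$

Obviously

$$\sum_i \hat\alpha_i = 0$$

Setting $\partial Q/\partial(\alpha\beta)_{ij}$ equal to zero, we obtain

$$\begin{aligned}
\sum_{(klm)_{1/2}} [\, x_{ijklm} &- \hat\mu - \hat\alpha_i - \hat\beta_j - \hat\gamma_k - \hat\delta_l - \hat\epsilon_m \\
&- (\widehat{\alpha\beta})_{ij} - (\widehat{\alpha\gamma})_{ik} - (\widehat{\alpha\delta})_{il} - (\widehat{\alpha\epsilon})_{im} \\
&- (\widehat{\beta\gamma})_{jk} - (\widehat{\beta\delta})_{jl} - (\widehat{\beta\epsilon})_{jm} - (\widehat{\gamma\delta})_{kl} - (\widehat{\gamma\epsilon})_{km} - (\widehat{\delta\epsilon})_{lm} \,] = 0
\end{aligned}$$

From this it follows that

$$\sum_{(klm)_{1/2}} x_{ijklm} - \tfrac{1}{2} KLM \hat\mu - \tfrac{1}{2} KLM \hat\alpha_i - \tfrac{1}{2} KLM \hat\beta_j - \tfrac{1}{2} KLM (\widehat{\alpha\beta})_{ij} = 0$$

and

$$(\widehat{\alpha\beta})_{ij} = \frac{\sum_{(klm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} KLM} - \frac{\sum_{(jklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} JKLM} - \frac{\sum_{(iklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IKLM} + \frac{\sum_{(ijklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IJKLM}$$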

Analogous results are obtained for the other interactions, and it can also 
be verified that the restrictions on the interactions are satisfied. Note 
that the estimates are exactly what one would expect them to be on an 
intuitive basis. 
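
The "intuitive" form of these estimates (a marginal mean minus the grand mean, and for an interaction a cell mean corrected for both margins) can be checked numerically. The sketch below is illustrative only: it builds the 32-run $2^4 \times 4$ half-replicate under the assumption that the base runs are those with an odd number of factors at level 2, and it uses made-up, purely additive responses rather than any data from the paper.

```python
from itertools import product


def half_replicate_24x4():
    """32 runs: the 2^5 half-replicate, then e1 -> e3 and e2 -> e4."""
    base = [r for r in product((1, 2), repeat=5)
            if sum(x == 2 for x in r) % 2 == 1]
    return base + [r[:4] + (r[4] + 2,) for r in base]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def estimates(data):
    """Grand mean, main effects of A and B, and the AB interaction terms."""
    mu = mean(data.values())
    a = {i: mean(x for r, x in data.items() if r[0] == i) - mu for i in (1, 2)}
    b = {j: mean(x for r, x in data.items() if r[1] == j) - mu for j in (1, 2)}
    ab = {(i, j): mean(x for r, x in data.items() if r[:2] == (i, j))
                  - mu - a[i] - b[j]
          for i in (1, 2) for j in (1, 2)}
    return mu, a, b, ab


# Hypothetical responses, additive in A and B only (not data from the paper).
data = {r: 10.0 + 2.0 * (r[0] == 2) + 3.0 * (r[1] == 2)
        for r in half_replicate_24x4()}
mu, a, b, ab = estimates(data)
print(mu, a, b)
```

With these artificial responses the estimated A effects sum to zero and the AB terms vanish, as the restrictions require.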

To investigate the orthogonality of the estimates, we form Table II
which gives the least squares estimates for the $2^4 \times 4$ half-replication;
the constant multiplier, 1/32, should be associated with each term. It is 
clear that the least squares estimates of the effects for any experiment in 
which all factors are at an even number of levels are given by formulas 
analogous to those on the previous pages. It is not obvious, however, 
when some of the factors are at odd levels, and it has already been shown 
that if too many factors are at an odd number of levels, the formulas 
do not hold at all. 


4. The Analysis of Variance 


The elegance of the analysis of variance for the designs will vary 
considerably. Although no full investigation has been carried out, 
it appears that the more factors there are at an even number of levels, 
the neater the analysis. This will be demonstrated by a comparison 
of the $2^4 \times 4$ and $2^4 \times 3$ half-replications.


TABLE II
Least Squares Estimates of Effects for the $2^4 \times 4$ Half-Replication
(each term to be multiplied by 1/32)

[Table of signs not recoverable from the source.]



For the $2^4 \times 4$ half-replication, all main effects and two-factor
interactions are orthogonal. Hence, the breakdown of the degrees of
freedom is straightforward and the degrees of freedom for all main
effects and two factor interactions can be isolated. The breakdown
of the degrees of freedom is as follows:


Effect                           d.f.

A                                  1
B                                  1
C                                  1
D                                  1
E                                  3
AB, AC, AD, BC, BD, CD (each)      1
AE, BE, CE, DE (each)              3
Residual                           6

Total                             31


In Table III estimates of effects for a $2^4 \times 3$ experiment have been
specified. Note that the following two-factor interactions are partially 
confounded. 


AB and CD 
AC and BD 
AD and BC 


All other main effects and two factor interactions are orthogonal. On 
further examination, however, it can be seen that confounded inter- 
actions are not linearly dependent, that is, they are only partially 
confounded, and it will be possible to extract the sum of squares
associated with the six degrees of freedom carried by AB, CD, AC, BD,
AD and BC. It is noteworthy that this permits a residual mean square
free of two factor interaction, though carrying very few degrees of 
freedom. 


TABLE III
Least Squares Estimates of Effects for the $2^4 \times 3$ Half-Replication
(each term to be multiplied by 1/24)

[Table of signs not recoverable from the source.]

The breakdown of the degrees of freedom would be as follows:


Effect                           d.f.

A                                  1
B                                  1
C                                  1
D                                  1
E                                  2
AB, CD, AC, BD, AD, and BC         6
AE                                 2
BE                                 2
CE                                 2
DE                                 2
Residual                           3

Total                             23


Of course, if it is known that three or more of the six interactions 
involving the two-level factors are zero, the single degrees of freedom 
for the remaining interactions can be listed separately. 
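
The degrees-of-freedom bookkeeping for both half-replications can be reproduced with a short Python sketch. It assumes only the usual rules (a main effect at $p$ levels carries $p - 1$ degrees of freedom, a two-factor interaction carries the product of the two, and the residual takes what is left of $n - 1$); the function name is hypothetical.

```python
def degrees_of_freedom(levels, n_observations):
    """d.f. for main effects, two-factor interactions, residual and total."""
    names = "ABCDE"[:len(levels)]
    df = {names[i]: levels[i] - 1 for i in range(len(levels))}
    for i in range(len(levels)):
        for j in range(i + 1, len(levels)):
            df[names[i] + names[j]] = (levels[i] - 1) * (levels[j] - 1)
    total = n_observations - 1
    df["Residual"] = total - sum(df.values())  # what remains after all effects
    df["Total"] = total
    return df


df_24x4 = degrees_of_freedom((2, 2, 2, 2, 4), 32)  # half of 64 points
df_24x3 = degrees_of_freedom((2, 2, 2, 2, 3), 24)  # half of 48 points
print(df_24x4["Residual"], df_24x3["Residual"])
```

For the $2^4 \times 3$ half-replication this leaves 3 residual degrees of freedom out of 23, in agreement with the analysis of variance reported for the example.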

It should be mentioned that when the method of forming the designs 
leads to an analysis of variance so complicated as to be prohibitive, it 
is always possible to carry out analyses on convenient portions of the 
data which may be appropriate to the problem at hand. It is hoped 
that the situations which admit convenient analyses of variance for the 
type of design described, can be studied and discussed in the future. 
At present, it can only be said that it appears that the more factors 
there are at an even number of levels, the easier the analysis of variance. 


5. The Estimates of the Missing Observations. 


In this section, the expression “missing observation” is used with a 
somewhat different meaning than is customary. Usually a missing 
observation is one which was planned for but not obtained. In what 
follows, however, it refers to an observation which was purposely not 
obtained. 

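Estimates of the missing observations are found by simply combining
the estimates of the effects with appropriate coefficients. For example,
the estimate of 11122 would be
$\hat\mu - \hat{A} - \hat{B} - \hat{C} + \hat{D} + \hat{E}_2 + \widehat{AB} + \widehat{AC} - \widehat{AD} + \widehat{BC} - \widehat{BD} - \widehat{CD} - \widehat{AE}_2 - \widehat{BE}_2 - \widehat{CE}_2 + \widehat{DE}_2$.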
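
How the coefficients for the two-level factors are chosen can be sketched in Python. The sign convention used here is an assumption read off the legible part of the example for 11122: a factor at its second level contributes +1, at its first level -1, and a two-factor interaction takes the product of the two signs. The function name is illustrative.

```python
from itertools import combinations


def two_level_signs(run):
    """Signs of the A, B, C, D main effects and their two-factor interactions.

    `run` gives the levels (1 or 2) of A, B, C and D for the observation
    being estimated; level 2 is coded +1 and level 1 is coded -1 (assumed).
    """
    sign = {name: (1 if level == 2 else -1) for name, level in zip("ABCD", run)}
    for p, q in combinations("ABCD", 2):
        sign[p + q] = sign[p] * sign[q]
    return sign


signs_11122 = two_level_signs((1, 1, 1, 2))  # A, B, C at level 1; D at level 2
print(signs_11122)
```

For the combination 11122 this gives negative signs on A, B, C, AD, BD and CD and positive signs on D, AB, AC and BC.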

By the nature of the mathematical model and in the absence of
three-factor interactions, the estimates are, of course, unbiased, in the
$2^4 \times 4$ experiment. In the $2^4 \times 3$ experiment, due to the partial
confounding, estimates of the missing observations obtained in the
traditional manner, are biased. In fact, it can be easily shown that in this
design the bias in AB is $-CD/3$, in AC, $-BD/3$, and so on for the
other interactions involved in the partial confounding. Unbiased
estimates of the missing observations can be formed but these will not
be discussed here.

It will be interesting to investigate the precision of the estimates.
First observe that if the variance of an observation is $\sigma^2$, the variance of
an estimated effect is in general $\sigma^2/n$, for the estimate of an effect is
equal to $1/n$ multiplied by a sum of $n$ independent variates. It can be
seen in Table II that in the $2^4 \times 4$ half replicate the estimates of effects
are orthogonal and so the variance of an estimate of a missing observation is
easily found. It would be the sum of the variances of the effects, which
for the $2^4 \times 4$ experiment would be


When the estimates are correlated, as in the $2^4 \times 3$ experiment,
it is a little more complicated to find the variance of an estimate of a
missing observation. However, a glance at Table III shows that if the
observations are divided into three groups on the basis of the last
subscript, and if the portions of the estimates associated with these groups
are designated $y_1$, $y_2$, and $y_3$, then two two-factor interactions which
are confounded will differ in $y_1$, $y_2$ and $y_3$ by only two signs. For
example, if we write

$$\widehat{AB} = y_1 + y_2 + y_3,$$

then

$$\widehat{CD} = -y_1 + y_2 - y_3$$

where, apart from the constant multiplier 1/24,

   y_1 =            y_2 =            y_3 =

   -(21111)         +(11112)         -(21113)
   -(12111)         +(22112)         -(12113)
   +(11211)         -(21212)         +(11213)
   +(11121)         -(21122)         +(11123)
   +(22211)         -(12212)         +(22213)
   +(22121)         -(12122)         +(22123)
   -(21221)         +(11222)         -(21223)
   -(12221)         +(22222)         -(12223)


TABLE IV

Estimates of Missing Observations in the $2^4 \times 3$ Experiment

To estimate a missing observation, find the subscripts of the observation in the top row and associate the signs in the column under the subscripts with the effects in the column on the extreme left.

Missing observations (column headings):
11111 22111 21211 21121 21112 12211 12121 12112 11221 11212 11122 22221
22212 22122 21222 12222 11113 22113 21213 21123 12213 12123 11223 22223

[Rows of effects and signs not recoverable from the source.]

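We shall need $\mathrm{cov}(\widehat{AB}, \widehat{CD})$:

$$\mathrm{cov}(\widehat{AB}, \widehat{CD}) = \mathrm{cov}(y_1 + y_2 + y_3,\; -y_1 + y_2 - y_3) = -\mathrm{var}(y_1) + \mathrm{var}(y_2) - \mathrm{var}(y_3)$$

$$\mathrm{var}(y_1) = \mathrm{var}(y_2) = \mathrm{var}(y_3) = \frac{8\sigma^2}{(24)^2} = \frac{\sigma^2}{72}$$

Hence $\mathrm{cov}(\widehat{AB}, \widehat{CD}) = -\sigma^2/72$.
Similarly $\mathrm{cov}(\widehat{AC}, \widehat{BD}) = \mathrm{cov}(\widehat{AD}, \widehat{BC}) = -\sigma^2/72$.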
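
The covariance can be verified by direct enumeration over the 24 runs of the $2^4 \times 3$ design. The sketch below is an illustration under two assumptions: each interaction estimate is 1/24 times a signed sum of the observations (sign equal to the product of the two factor signs, level 2 coded +1), and the design is the odd-number-of-2's half-replicate with the $e_1$ runs repeated at $e_3$. Results are expressed in units of $\sigma^2$.

```python
from fractions import Fraction
from itertools import product


def design_24x3():
    """24 runs: 2^5 half-replicate (odd number of 2's) plus e1 runs moved to e3."""
    base = [r for r in product((1, 2), repeat=5)
            if sum(x == 2 for x in r) % 2 == 1]
    return base + [r[:4] + (3,) for r in base if r[4] == 1]


def interaction_sign(run, f, g):
    """Sign of the two-factor interaction contrast for factors at positions f, g."""
    s = lambda level: 1 if level == 2 else -1
    return s(run[f]) * s(run[g])


def covariance(pair1, pair2):
    """Covariance of two interaction estimates, in units of sigma squared."""
    runs = design_24x3()
    n = len(runs)
    dot = sum(interaction_sign(r, *pair1) * interaction_sign(r, *pair2) for r in runs)
    return Fraction(dot, n * n)


cov_ab_cd = covariance((0, 1), (2, 3))
cov_ac_bd = covariance((0, 2), (1, 3))
cov_ab_ac = covariance((0, 1), (0, 2))
print(cov_ab_cd, cov_ac_bd, cov_ab_ac)
```

The enumeration returns $-1/72$ for the confounded pairs and zero for a pair such as AB and AC.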
Table IV gives all the estimates of the missing observations.
The variance of an estimate is equal to the sum of the variances
of the effects plus the sum of the covariances of the effects with the
proper sign prefixed. It will follow then that the estimates will have
different variances depending on how the signs of the terms which make
up AB and CD, AC and BD, AD and BC agree or disagree. Table V
gives the variances of the estimates of all possible types of missing
observations.
It can be seen that although the variances differ, they differ by
very little, and this is likely to be the case for higher order factorials
and when the factors are at a greater number of levels.


TABLE V

Variances of the Estimates of all Possible Types of Missing Observations
($2^4 \times 3$ Half-Replication)

Condition of signs within the sets      Example, missing      Variance
AB and CD, AC and BD, AD and BC         observation from
                                        Table IV

All three sets agree in sign            21211                 $\frac{45}{72}\sigma^2$
All three sets disagree                 11212                 $\frac{51}{72}\sigma^2$
Two of the sets disagree, and one
set agrees                              22111                 $\frac{49}{72}\sigma^2$
One of the sets disagrees, and two
agree                                   12112                 $\frac{47}{72}\sigma^2$



TABLE VI
Experimental Data and Estimates for $2^4 \times 3$ Experiment

             Dependent Variable                  Dependent Variable
Condition   Observed   Estimated    Condition   Observed   Estimated

11113         5.30       4.52       21113         6.05
11123         6.65                  21123         7.65       7.96
11112         4.60                  21112         3.95       4.10
11122         5.75       5.68       21122         4.95
11111         4.90       4.70       21111         1.60
11121         6.05                  21121         1.80       3.38
11213        20.15                  21213        18.85      19.00
11223        28.15      26.36       21223        25.00
11212        19.80      20.12       21212        16.80
11222        26.70                  21222        22.70      23.14
11211        19.20                  21211        15.20      14.80
11221        25.75      24.40       21221        20.45
12113         5.85                  22113         6.00       5.78
12123         7.55       8.10       22123         7.80
12112         4.70       5.86       22112         3.60
12122         5.95                  22122         4.45       3.44
12111         4.40                  22111         0.60       -.12
12121         5.85       4.44       22121         0.40
12213        17.30      17.10       22213        14.35
12223        23.45                  22223        18.70      20.46
12212        14.70                  22212        11.25       9.60
12222        20.65      20.30       22222        14.95
12211        12.50      13.82       22211         8.05
12221        17.75                  22221        10.05      12.42

6. Example 


Experimental Towing Tank data were available in which five
independent variables, termed A, B, C, D and E, were involved. A, B,
C, and D were each at two levels and E at three levels. For this $2^4 \times 3$
experiment all possible 48 data points had been obtained experimentally,
and the experimental error had been determined from 30 repeat runs
to equal .19 (i.e. $\hat\sigma^2 = .19$). It was decided to set up an estimation design
for this experiment and to compare the results of the half-replication
with that of the full-replication. The design for the $2^4 \times 3$ experiment
has already been formed. Table VI shows the full set of data points




TABLE VIIa
Analysis of Variance, Full Replication $2^4 \times 3$ Experiment

Source        S.S.     d.f.      M.S.         F

A            97.61       1      97.61      212.2*
B           124.00       1     124.00      269.5*
C          2214.76       1    2214.76     4814.8*
D           131.50       1     131.50      285.8*
E           129.06       2      64.53      140.2*
AB            3.23       1       3.23        7.2*
AC           20.09       1      20.09       43.6*
AD            4.24       1       4.24        9.2*
BC          110.87       1     110.87      241.0*
BD            2.50       1       2.50        5.4*
CD           58.20       1      58.20      126.5*
AE           25.58       2      12.79       27.8*
BE           10.80       2       5.40       11.7*
CE            3.05       2       1.52        3.3
DE            2.80       2       1.40        3.0
Residual     12.45      27        .46
Total      2950.74      47      62.78

*Indicates significance at the 5% level.


TABLE VIIb
Analysis of Variance, Half-Replication $2^4 \times 3$ Experiment

Source        S.S.     d.f.      M.S.         F

A            41.21       1      41.21       50.3*
B            56.89       1      56.89       69.4*
C          1115.89       1    1115.89     1361.4*
D            69.19       1      69.19       84.4*
E            61.77       2      30.88
AE           12.50       2       6.25        7.6*
BE            7.07       2       3.53        4.3*
CE            0.86       2       0.43        0.5
DE            1.73       2       0.86        1.0
AB, AC, AD,
BC, BD, CD   93.46       6      15.58       19.0*
Residual      2.46       3        .82
Total      1463.03      23      63.61

*Indicates significance at the 5% level.




and points estimated under the design. Comparisons of observed and
estimated points should be made in the light of $\hat\sigma = .45$, and an error of
estimation of approximately $\sqrt{2/3}\,\hat\sigma = .37$. Also, since some of the
effects are partially confounded, each estimate has a bias. The standard
deviation of the difference between an observation and an estimate is
.58. For the 24 points estimated, the maximum discrepancy is equal
to about 4 standard deviations (of a difference).
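
As an arithmetic check on the figure of .58, the short sketch below combines the two error sources in quadrature. It assumes that the quoted values are a standard deviation of .45 for an observation and .37 for an estimated point, and that the two are independent (the 24 estimated points are not part of the half-replicate); the variable names are illustrative only.

```python
import math

sigma_obs = 0.45   # assumed reading: standard deviation of a single observation
sigma_est = 0.37   # assumed reading: standard error of an estimated point

# Observation and estimate are taken as independent, so their variances add.
sd_difference = math.sqrt(sigma_obs**2 + sigma_est**2)
print(round(sd_difference, 2))
```

Under these assumptions the result rounds to the value quoted in the text.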

It may be noted (Tables VIIa, VIIb) that at the 5% level of significance
the results of the half- and full-replicate agree. At the 1%
level, the results on the BE interaction would differ. In general, of
course, a residual sum of squares carrying only three degrees of
freedom is not a good denominator for the F-ratio. In much engineering
work, the only estimate of error which is trusted is determined from
repeat runs and, if possible, it is best to use an estimate thus obtained
rather than the residual. If the error estimate obtained from repeat
runs ($\hat\sigma^2 = .19$) had been used in the two analyses of variance, only
the CE interaction in the analysis for the half-replication would not
have shown as significant at the 5% level. It is also noteworthy that
in both analyses the ratios of the residual mean squares to $\hat\sigma^2 = .19$
indicate the presence of some three-factor interactions.
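
The effect of substituting the repeat-run variance for the residual can be sketched numerically. The mean squares below are copied from Tables VIIa and VIIb (the full-replicate residual mean square 0.46 is computed here as 12.45/27). The critical value 3.32 is the tabulated upper 5% point of F with 2 and 30 degrees of freedom; taking 30 as the error degrees of freedom is an assumption, since the text gives only the number of repeat runs.

```python
REPEAT_RUN_VARIANCE = 0.19   # error variance from repeat runs, as quoted in the text

# Mean squares for the two weakest interactions and the residual in each analysis.
mean_squares = {
    "full": {"CE": 1.52, "DE": 1.40, "Residual": 0.46},
    "half": {"CE": 0.43, "DE": 0.86, "Residual": 0.82},
}

F_CRIT_2_30 = 3.32   # assumed: 5% point of F(2, 30)

def f_ratios(rows, error_ms=REPEAT_RUN_VARIANCE):
    """Divide each mean square by the repeat-run error variance."""
    return {name: ms / error_ms for name, ms in rows.items()}

for label, rows in mean_squares.items():
    for name, f in f_ratios(rows).items():
        print(label, name, round(f, 2))
```

With this denominator only the half-replicate CE ratio falls below the assumed critical value, and both residual ratios exceed unity by a wide margin, which is the pattern the paragraph describes.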


REFERENCE 


1. Kempthorne, Oscar. The Design and Analysis of Experiments. John Wiley and 
Sons, Inc., New York. 1952. 




BLOCK EFFECTS IN THE DETERMINATION OF 
OPTIMUM CONDITIONS 


Robert M. DeBaun


American Cyanamid Company, New York


The second-order "central composite" designs proposed by Box and
Wilson (1951) for the determination of optimum levels of a set of
continuously variable factors consist of a factorial design in full or
suitable partial replication in the n factors under study plus a cross
polytope of radius $\alpha$ with 2n points on the exterior of the design
(radius = $\alpha$) and q replications of the central point. In practice, these
designs tend to arise as follows: The experimenter assigns levels of the
true experimental variables to the coded levels of the design variables
$(0, \pm 1, \pm\alpha)$, and then conducts a factorial experiment (in full or
partial replication) at levels $X_i = \pm 1$ $(i = 1, 2, \dots, n)$. This design
is conducted in a sufficiently large partial replication of the full factorial
so that a first-order response surface (I) can be fitted to the yield values
at the $X_i = \pm 1$ points in the factor space.


(I) $Y = B_0 + B_1X_1 + B_2X_2 + B_3X_3 + B_4X_4 + \cdots + B_nX_n$


Sufficient points are usually included in the factor space so that at
least some second-order terms (in $B_{ij}X_iX_j$) can also be estimated. If
the magnitude of the second-order coefficients is such that the first-order
approximation will be insufficient to describe the response surface
accurately, points are added at the center $(0, 0, 0, \dots, 0)$ and the
cross-polytope is added with points at $(\pm\alpha, 0, 0, \dots, 0)$; $(0, \pm\alpha, 0, \dots, 0)$;
$\dots$; $(0, 0, 0, \dots, \pm\alpha)$. The inclusion of the additional points
permits the fitting of a second-order response surface to the yield data
(II), from which the response can frequently be characterized adequately.


(II) $Y = B_0 + \sum_{i=1}^{n} B_iX_i + \sum_{i=1}^{n} B_{ii}X_i^2 + \sum_{i<j} B_{ij}X_iX_j$



The following considerations have been found useful in our application
of these experimental designs. If in the preliminary block of
experiments, i.e. the "factorial" stage, one or more replications are
added at the point $(0, 0, \dots, 0)$ ($X_i = 0$), then the mean yield at the
peripheral points ($X_i = \pm 1$) less that at the central point estimates
the sum of the quadratic coefficients ($B_{ii}$). If this value appears large
relative to the absolute values of the $B_i$ estimated, the response surface
is definitely revealed as curved, and a good estimate of it will require
the fitting of the full second-order model. If not, it is often profitable
to move along a line of "steepest ascent" in the factor space, before
attempting to fit the full second-order model.
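
The curvature check described above can be illustrated with a noise-free sketch. The coefficients below are hypothetical, the surface omits cross-product terms for brevity (they would average to zero over a full factorial in any case), and a full $2^3$ factorial stands in for whatever replication the experimenter actually uses.

```python
from itertools import product

def response(x, b0, b, bii):
    """Noise-free second-order surface without cross-product terms (illustrative)."""
    linear = sum(bi * xi for bi, xi in zip(b, x))
    quadratic = sum(q * xi**2 for q, xi in zip(bii, x))
    return b0 + linear + quadratic

# Hypothetical coefficients B_0, B_i and B_ii for three factors.
b0, b, bii = 10.0, [1.0, -2.0, 0.5], [-0.8, -0.3, -0.4]

factorial_points = list(product([-1, 1], repeat=3))   # peripheral points, X_i = +/-1
center = (0, 0, 0)

mean_factorial = sum(response(x, b0, b, bii) for x in factorial_points) / len(factorial_points)
curvature = mean_factorial - response(center, b0, b, bii)

# Linear terms cancel over the factorial, leaving the sum of the B_ii.
print(curvature, sum(bii))
```

In real data the comparison is made against the error estimated from the replicated center points rather than read off exactly.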

In adding the cross-polytope, it may be necessary to consider
"block" effects. Depending on the level of the design variables selected
for $\alpha$, the other coefficients, in particular the important $B_{ii}$, may be
biased to some extent. The bias is determined both by $\alpha$ and by the
amount of shift in $B_0$. However, if the block effect is such as to alter
only $B_0$, the nature of the response surface can still be determined
without bias due to a block effect.

For example, in a five-factor experiment, the first block might
consist of the half-replicate of the $2^5$ factorial design (I = ABCDE),
plus four replications at $X_i = 0$. This block provides one degree of
freedom for $B_0$, fifteen for the $B_i$ and the $B_{ij}$, one for the summed $B_{ii}$
and three for error. If the $B_{ij}$ and the summed $B_{ii}$ appear small with
respect to the $B_i$, the approach along the line of "steepest ascent" is
indicated. If it is desired to fit the full second-order model by adding
the cross-polytope, the possibility of the block effect needs to be considered.
The expected value for the mean from the first block is
$B_0 + \tfrac{4}{5}\sum B_{ii}$. The block contrast will be orthogonal to the second-order
response surface if the expected value of the mean for the block
containing the cross-polytope is the same. The expected value for this
mean is $B_0 + (2\alpha^2/n_c)\sum B_{ii}$, where $\alpha$ is the radius of the cross-polytope
and $n_c$ is the total number of points it contains. Such a block would be
the ten points $(\pm 2, 0, 0, 0, 0)$; $\dots$; $(0, 0, 0, 0, \pm 2)$. The block contrast
will also be orthogonal to the third-order response terms, although these
will not be free of the $B_i$ and $B_{ij}$.
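
The equality of the two block means can be checked by constructing the design points directly. The sketch below builds the half-replicate with defining relation I = ABCDE plus four center points, and the ten axial points at distance 2, then computes the average of $X_i^2$ in each block, which is the multiplier of $\sum B_{ii}$ in the expected block mean. Function and variable names are illustrative.

```python
from itertools import product
from math import prod

n = 5

# Block 1: half-replicate of the 2^5 factorial (product of the signs = +1)
# plus four replications of the center point.
half_rep = [x for x in product([-1, 1], repeat=n) if prod(x) == 1]
block1 = half_rep + [(0,) * n] * 4

# Block 2: cross-polytope points at distance alpha along each axis.
alpha = 2
block2 = []
for i in range(n):
    for sign in (-1, 1):
        point = [0] * n
        point[i] = sign * alpha
        block2.append(tuple(point))

def mean_square_of_factor(block, i=0):
    """Average of X_i^2 over a block: the multiplier of B_ii in the block mean."""
    return sum(x[i] ** 2 for x in block) / len(block)

print(len(block1), mean_square_of_factor(block1))   # 20 points, 16/20
print(len(block2), mean_square_of_factor(block2))   # 10 points, 2*alpha^2/10
```

Both multipliers equal 4/5, so a shift in level between the two blocks is not mixed with the quadratic coefficients.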

If effects of order higher than two are suspected on completion
and analysis of the two blocks, a third block can be run, consisting of
the other half-replicate of the $2^5$ factorial design plus four more replications
at $X_i = 0$. The $B_{iij}$ will now be partially confounded with the
$B_i$, but the $B_{ijk}$ will be free of the second-order response surface and
of the block contrast.

In this example, the length of $\alpha$ is exactly that suitable for a
"rotatable" central composite design (Hunter, 1954). The inclusion of
additional points at the center ($X_i = 0$) also enhances the design, as the
location of the maximum is made more precise by reducing correlation
among the estimates of the $B_{ii}$ (Box and Hunter, 1954).
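
The remark on rotatability can be checked against the usual rule from the response-surface literature, quoted here as an assumption rather than taken from this note: a central composite design is rotatable when the axial distance equals the fourth root of the number of factorial points.

```python
def rotatable_alpha(n_factorial_points):
    """Axial distance giving rotatability: fourth root of the factorial point count."""
    return n_factorial_points ** 0.25

# Half-replicate of the 2^5 factorial: 16 factorial points.
print(rotatable_alpha(16))
```

For the sixteen-point half-replicate this gives the axial distance of 2 used in the example.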

It can be seen that the inclusion of central points in the various 
blocks that might arise enables the experimenter to balance block 
effects, rotatability and uniform information, subject only to his in- 
genuity in blocking the “factorial” portion of the design. 

A similar, independent treatment of the blocking problem in central 
composite designs is described by Box and Hunter (1955). 


REFERENCES 


Box, G. E. P., and J. S. Hunter, (1954). Biometrika, 41, 190. 

Box, G. E. P., and J. S. Hunter, (1955). Ann. Math. Stat., in press. 

Box, G. E. P., and K. B. Wilson, (1951). J. Roy. Stat. Soc. Series B, 13, 1. 
Hunter, J. S. (1954). Ph.D. Dissertation. North Carolina State College. 


ADJUSTMENT BY COVARIANCE AND CONSEQUENT TESTS 
OF SIGNIFICANCE IN SPLIT-PLOT EXPERIMENTS* 


JEANNE TITUS TRUITT** AND H. FAIRFIELD SMITH


North Carolina State College 


1. INTRODUCTION 


This paper investigates methods for adjusting experimental data by 
covariance on a concomitant variable, and of appropriate ensuing tests 
of significance, when, as in a split-plot experiment, the analysis of 
covariance has two or more rows for errors at different levels, each
yielding regressions of the dependent variate on the concomitant 
variable which may be assumed to be equal. Somewhat to our surprise 
when seeking examples six out of nine split-plot field experiments 
examined showed significant differences between the two regressions. 
This suggests that the assumption may not be so generally valid as 
might be expected, but the sole concern of this paper is with what to 
do when the assumption is permissible. 

Anderson (1946), when discussing how to deal with a missing sub- 
plot, suggested the procedure described in sec. 3 below, but warned that
the two mean squares so evaluated would not be independent and the 
F-test therefore not valid. Stimulus for the work here reported derived 
from finding that method being promulgated without Anderson’s 
warning as a general procedure for covariance adjustment in split-plot 
experiments; whence it seemed worth while to endeavour to work out 
just what the procedure implied and the distributions involved. 

Bartlett (1937) stated a different procedure (sec. 4 below) without 
discussion or proof of its validity. Being given in a single sentence in 
a paper dealing with many other matters his statement seems to have 
been generally overlooked and was unknown to us until after we had 
derived his test independently on the grounds discussed below. 

Assume given a split-plot experiment with m main-plot treatments 
in r randomized blocks and with q split-plot treatments in each main- 
plot. Let y be the observed experimental variate, and x be a concomitant 
variable unaffected by treatments and measured by deviations from 
its mean over the whole experiment. Assuming the regression of y on 
x to be the same both between main-plots within blocks and between 


*Sponsored in part by the Office of Ordnance Research, United States Army, 
under contract DA-36-034-ORD-1517. 
**Now at Dayton University. 


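split-plots within main-plots, the usual linear (regression) model is

$$y_{ijk} = \mu + \alpha_i + \rho_j + \delta_{ij} + \gamma_k + (\alpha\gamma)_{ik} + \beta x_{ijk} + \epsilon_{ijk} \tag{1}$$

$i = 1, \dots, m$; $j = 1, \dots, r$; $k = 1, \dots, q$

$\delta_{ij}$ = main-plot error, NID$(0, \sigma_\delta^2)$
$\epsilon_{ijk}$ = split-plot error, NID$(0, \sigma_\epsilon^2)$

$$\sum_i \alpha_i = \sum_j \rho_j = \sum_k \gamma_k = \sum_i (\alpha\gamma)_{ik} = \sum_k (\alpha\gamma)_{ik} = 0$$

The $x_{ijk}$ are usually regarded as a given set of constants, as in ordinary
regression analysis, but this viewpoint will be later modified.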


TABLE 1.
Notation for Sums of Squares and Products

Source            d.f.                        $y^2$       $xy$        $x^2$
Replications      $(r - 1)$
Main treatments   $\nu_M = (m - 1)$           $M_{yy}$    $M_{xy}$    $M_{xx} = M$
Error D           $\nu_D = (m - 1)(r - 1)$    $D_{yy}$    $D_{xy}$    $D_{xx} = D$
Sub-treatments    $(q - 1)$                   $S_{yy}$    $S_{xy}$    $S_{xx}$
Interaction       $(m - 1)(q - 1)$            $SM_{yy}$   $SM_{xy}$   $SM_{xx}$
Error E           $\nu_E = m(q - 1)(r - 1)$   $E_{yy}$    $E_{xy}$    $E_{xx} = E$

Table 1 shows the notation which will be used to designate sums of
squares and products in the analysis of variance and covariance. For
ease of printing subscripts will be omitted from the three symbols to be
most frequently used; wherever M, D and E occur without other
indication subscripts xx are to be understood; when they occur with a
superscript, subscripts yy are to be understood. Each row yields an
estimate of a regression coefficient and the sum of squares of deviations
about the respective regression. For example $b_M = M_{xy}/M_{xx}$; and
$M^* = M_{yy} - b_M M_{xy} = rq \sum_i (\bar y_{i..} - \bar y_{...} - b_M \bar x_{i..})^2$ = the sum of
squares of deviations of the main treatment means, $\bar y_{i..}$, about that
regression. Similarly $b_D = D_{xy}/D_{xx}$; and $D^* = D_{yy} - b_D D_{xy} =
q \sum_{ij} [\bar y_{ij.} - \bar y_{i..} - \bar y_{.j.} + \bar y_{...} - b_D(\bar x_{ij.} - \bar x_{i..} - \bar x_{.j.})]^2$ = the
sum of squares of deviations of main-plot means from respective
members of a set of parallel regressions through a compound of treatment
and block means ($\bar x_{...}$ being zero by definition of x).
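
The row-wise calculation just described (a regression coefficient and the sum of squares about that regression, both obtained from one row of sums of squares and products) can be sketched as follows. The data are hypothetical deviations, already centered, and the function name is illustrative.

```python
def row_regression(r_yy, r_xy, r_xx):
    """Return (b_R, R*) for one row of the covariance table."""
    b = r_xy / r_xx
    r_star = r_yy - b * r_xy
    return b, r_star

# Hypothetical centered deviations for one row of the analysis.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-3.1, -1.4, 0.2, 1.3, 3.0]

r_xx = sum(xi * xi for xi in x)
r_xy = sum(xi * yi for xi, yi in zip(x, y))
r_yy = sum(yi * yi for yi in y)

b, r_star = row_regression(r_yy, r_xy, r_xx)

# The same quantity computed directly as squared deviations about the fitted line.
direct = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
print(b, r_star, direct)
```

The two computations agree, which is the identity used for every row of table 1.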




Treatment effects are defined as differences between mean yields of
the observed treatment combinations. Main plots, being balanced
for sub-plots, form simple randomized blocks within which $b_D$ provides
an estimate of $\beta$. Significance of main-treatment effects adjusted for
x can be tested in the usual way with the "reduced" treatment sum
of squares $(M + D)^* - D^*$.

The main plots form mr blocks of split-plots but with the addition
over simple randomized blocks that m groups, of r "blocks" each, are
"identifiable" (Smith, 1955b) by main treatments to yield an estimate
of the main $\times$ sub-treatments interaction. Since we consider only the
factorial model, row E gives the estimate of error for sub-treatment
effects, as well as for interaction, and another estimate of $\beta$. Reduced
treatment and interaction sums of squares follow in the usual way for
tests of significance.

These tests are well known and unambiguously valid. The reasons 
for considering alternative procedures are: (1) given the postulate that 
regression is homogeneous at both levels each part of the analysis uses 
only part of the available information on $\beta$. Therefore greater accuracy
should be possible from combining both parts to obtain a single estimate 
of the coefficient. And (2) results adjusted by different coefficients 
cannot be presented in a two way table for crossed treatments with 
self-consistent means. Although not important this is unpleasing and 
complicates neat presentation of results. 


2. MAXIMUM LIKELIHOOD SOLUTION AND OTHER WEIGHTED 
MEANS OF MAIN- AND SUB-PLOT REGRESSIONS 


The maximum likelihood solution for $\beta$ (Cochran, 1946) is the
weighted mean of estimates from each error row:

$$\hat\beta = \frac{b_D D/\sigma_1^2 + b_E E/\sigma_2^2}{D/\sigma_1^2 + E/\sigma_2^2} \tag{2}$$

where $\sigma_1^2 = \sigma_\epsilon^2 + q\sigma_\delta^2$ = the theoretical error variance for main plots,
$\sigma_2^2 = \sigma_\epsilon^2$ = the error variance of sub-plots. In the complete maximum
likelihood solution the estimates $\hat\sigma_1^2$, $\hat\sigma_2^2$ depend on $\hat\beta$; they can only be
obtained by iteration and resulting distribution theory is difficult.
However almost all the information on these variances is contained in
$s_1^2 = D^*/(\nu_D - 1)$ and $s_2^2 = E^*/(\nu_E - 1)$; the full maximum likelihood
solution being equivalent to salvaging only a small fraction of a degree
of freedom for each. We may therefore be content to accept $s_1^2$, $s_2^2$ as
estimates of $\sigma_1^2$, $\sigma_2^2$; but exact sampling distributions will still be
excessively difficult to determine if they, random variables, be substituted
in (2). Little information may be lost, and simplification gained, with
an arbitrarily weighted mean, say

$$b_c = c\,b_D + (1 - c)\,b_E \tag{3}$$

If $c = c_0 = D/(D + \omega E)$, where $\omega = \sigma_1^2/\sigma_2^2$, this becomes the maximum
likelihood solution with minimum variance for known weights. If $\omega$
be replaced by an arbitrary $w$, perhaps guessed from previous experience,
the variance of $b_c$ is less than the lesser of var($b_D$) and var($b_E$) provided

$0 < c < 2c_0$, or equivalently $\infty > w > (\omega E - D)/2E > 0$,
or $(2c_0 - 1) < c < 1$, or equivalently $2\omega D/(D - \omega E) > w > 0$.
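
The two ranges of the weight can be checked numerically. The sketch below takes var($b_D$) = $\sigma_1^2/D$ and var($b_E$) = $\sigma_2^2/E$, treats the two coefficients as independent, and scans a grid of weights; all numerical values are hypothetical and the names `s1`, `s2` stand for $\sigma_1^2$, $\sigma_2^2$.

```python
def var_bc(c, D, E, s1, s2):
    """Variance of the weighted mean c*b_D + (1-c)*b_E of independent coefficients."""
    return c**2 * s1 / D + (1 - c)**2 * s2 / E

D, E, s1, s2 = 4.0, 10.0, 3.0, 1.0      # hypothetical values
omega = s1 / s2
c0 = D / (D + omega * E)                # weight giving minimum variance

grid = [i / 1000 for i in range(1, 1000)]
better_than_bE = [c for c in grid if var_bc(c, D, E, s1, s2) < s2 / E]
better_than_bD = [c for c in grid if var_bc(c, D, E, s1, s2) < s1 / D]

print(round(c0, 4), max(better_than_bE), min(better_than_bD))
```

On this grid the weights that improve on $b_E$ are exactly those below $2c_0$, and those that improve on $b_D$ are those above $2c_0 - 1$ (here negative, so every weight in the open unit interval qualifies).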


Let R stand for the capital letter in any single row of table 1. If
we form an adjusted sum of squares with any arbitrary or estimated
coefficient $b'$

$$R' = R_{yy} - 2b'R_{xy} + b'^2R_{xx} \tag{4}$$
$$\phantom{R'} = R^* + R_{xx}(b_R - b')^2 \tag{5}$$

Under the null hypothesis of no treatment effects

$$\mathcal{E}(R^*)/(\nu - 1) = R_{xx}\,\mathcal{E}(b_R - \beta)^2 = \sigma_i^2$$

where $\nu$ is the degrees of freedom in row R, and i = 1 or 2 according as
R is in the upper or lower part of table 1. Also $R^*$ and $b_R$ are independent.
Therefore if $b'$ with expectation $\beta$ be also normally distributed
independently of $R^*$, then $R'$ is distributed as the weighted sum of
two independent $\chi^2$, namely

$$\chi^2_{\nu-1}\,\sigma_i^2 + \chi^2_1\,R_{xx}\,\mathrm{var}(b_R - b') \tag{6}$$

If R represents a treatment row, say main-treatments, and if treat-
ments have an effect,


The first term on the right is a treatment contrast which we want to
retain in the treatment sum of squares. The purpose of adjusting is to
estimate it "decontaminated" from $\beta$. We would like to do this in such
a way that the total adjusted sum of squares will have a manageable
distribution and be suitable for testing against an independent error
mean square from another row or rows of the analysis. Since the
distribution of a sum of $\chi^2$'s with unequal weights is intractable the only
simple solution is to make $M'$ proportional to a homogeneous $\chi^2$.
Since (5), when R = M, inevitably contains $M^*$, (6) shows that that
can be done only if we can find a $b'$ and weight k such that

$$kM_{xx}\,\mathrm{var}(b_M - b') = \sigma_1^2 \tag{7}$$

Using a weighted mean of the two error regressions, $b_c$, we have

$$M\,\mathrm{var}(b_M - b_c) = \sigma_1^2\left[1 + \frac{c^2M}{D} + \frac{(1-c)^2M}{\omega E}\right]$$

and to meet condition (7) k must be the reciprocal of the factor on the
right. In ignorance of $\omega$ it can be known exactly only if c = 1; that is,
we use $b_D$ alone with k = D/(M + D). This leads to the usual reduced
sum of squares; or is equivalent to using

$$b' = b_M - \sqrt{D/(M+D)}\,(b_M - b_D)$$

so that $(b_M - b')$ is proportional to $(b_M - b_D)$ and

$$M\,\mathrm{var}(b_M - b') = \frac{MD}{M+D}\,\mathrm{var}(b_M - b_D) = \sigma_1^2$$

If we use c = D/(D + wE) for a given w, and

$$k = \frac{D + wE}{M + D + wE}$$

then

$$kM\,\mathrm{var}(b_M - b_c) = \sigma_1^2\left[1 + \frac{(w - \omega)\,wME}{\omega(D + wE)(M + D + wE)}\right] \tag{8}$$

This is too small if $c_0 < c < 1$ (the minimum is too complicated to be
worth reporting) and is too large when $c < c_0$, reaching a maximum
(for 0 < c < 1) at c = 0. The discrepancy is then $\sigma_1^2M/\omega E$. If w is
a good guess for $\omega$, or even if it is bad but E is large, the discrepancy
may not seriously affect tests made as if w were known. However if
only one estimated regression is to be used for all adjustments most
workers may prefer the simpler approach of section 4.
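
The behaviour described for a guessed weight can be explored numerically. The sketch below assumes that $b_M$, $b_D$ and $b_E$ are independent with variances $\sigma_1^2/M$, $\sigma_1^2/D$ and $\sigma_2^2/E$, that the true ratio is $\omega = \sigma_1^2/\sigma_2^2$, and that the multiplier k is chosen as if the guess w were correct, i.e. k = (D + wE)/(M + D + wE); this form of k is a reconstruction, not a quotation. Results are in units of $\sigma_1^2$ and all numbers are hypothetical.

```python
def scaled_variance(M, D, E, omega, w):
    """k * M * var(b_M - b_c) in units of sigma_1^2, with c and k built from the guess w."""
    c = D / (D + w * E)
    k = (D + w * E) / (M + D + w * E)      # assumed form: exact when w equals omega
    factor = 1 + c**2 * M / D + (1 - c)**2 * M / (omega * E)
    return k * factor

M, D, E, omega = 3.0, 5.0, 20.0, 2.0       # hypothetical values
for w in (0.5, 1.0, 2.0, 4.0, 1e9):
    print(w, round(scaled_variance(M, D, E, omega, w), 5))
```

A guess below the true ratio gives a value under unity, a guess above it a value over unity, and as w grows without limit (c tending to zero) the excess tends to $M/\omega E$.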
The error variance of a difference between two means adjusted by
$b_c$ is fairly obvious. Although its estimate involves $s_2^2$ that component
will usually be so small a part of the total that the estimate can be taken
as based on the degrees of freedom associated with $s_1^2$. Converse
arguments apply for contrasts among split plots.


3. ANDERSON’S METHOD 


Using covariance on dummy variables for missing plots Anderson
(1946) noted that $b_D$, the estimate for a missing sub-plot which minimizes
$D^*$, neglects all information on yield of the partly missing main-plot
which is contained in its existing $(q - 1)$ sub-plots. He therefore recommended
completing the main-plot yields with the sub-plot estimate,
$b_E$; and suggested that appropriate reduced sums of squares for main-treatments
and error, which would be unbiased in the sense that they
are freed of $\beta$, might be

$$M^\circ = (M + E)^* - E^*$$
$$D^\circ = (D + E)^* - E^*$$


He noted that these sums of squares are not independent and so would
not yield a valid F test but did not investigate the amount of disturbance.

If one wishes to use only one estimate of $\beta$, and to avoid complexities
associated with a compound of $b_D$ and $b_E$ decides to use just one of them,
one naturally chooses that one which is more accurately evaluated.
Several authors have pointed out that usually var($b_E$) < var($b_D$),
that is $\sigma_2^2/E < \sigma_1^2/D$, because $\sigma_2^2 < \sigma_1^2$ and usually E > D owing to the
larger number of degrees of freedom associated with E although the
mean squares may be in reverse order. We assume this to be so and
that $b_E$ is to be used throughout. Converse arguments apply in the
exceptional case of $b_D$ being more accurate.

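Using $b_E$ the tests of significance for sub-treatments and interaction
obviously go through in the usual way. For main-treatments it has
been proposed to use Anderson's method as a general rule. There are
three defects.

$$M^\circ = M^* + \frac{ME}{M+E}(b_M - b_E)^2 \tag{9}$$
$$D^\circ = D^* + \frac{DE}{D+E}(b_D - b_E)^2 \tag{10}$$

$M^*$ and $D^*$ are independent of all three regression coefficients, and, in
absence of treatment effects, have expectations

$$\mathcal{E}(M^*)/(m - 2) = \mathcal{E}(D^*)/[(m - 1)(r - 1) - 1] = \sigma_1^2 = \sigma_2^2 + q\sigma_\delta^2$$

Writing R to stand for M or D

$$\frac{RE}{R+E}\,\mathcal{E}(b_R - b_E)^2 = \frac{E\sigma_1^2}{R+E} + \frac{R\sigma_2^2}{R+E} = \sigma_2^2 + \frac{Eq\sigma_\delta^2}{R+E}$$

Consequently expectations of the mean squares are

$$\frac{\mathcal{E}(M^\circ)}{m-1} = \sigma_2^2 + q\sigma_\delta^2\left(1 - \frac{M}{(m-1)(M+E)}\right) \tag{11}$$
$$\frac{\mathcal{E}(D^\circ)}{\nu_D} = \sigma_2^2 + q\sigma_\delta^2\left(1 - \frac{D}{\nu_D(D+E)}\right) \tag{12}$$
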
These are not equal (except for special values of M, D and E) unless
$\sigma_\delta^2 = 0$. In the special case studied by Anderson, with mrq = N,

$M = (m - 1)/N$

$D = (m - 1)(r - 1)/N$

$E = m(q - 1)(r - 1)/N$ (13)
Whence the discrepancy is only


$$\frac{-(m - 1)(r - 2)\,q\sigma_\delta^2}{(m - 1)^2(r - 1) + \nu_E(\nu_E + mr - r)}$$


The absolute discrepancy must be less than $q\sigma_\delta^2/(m - 1)$ and will usually
be much less. When x is randomly distributed it will usually be negative
(that is $\mathcal{E}$(treatment m.sq.) < $\mathcal{E}$(error m.sq.)), but may be > 0 if

$$\frac{M}{\nu_M(M + E)} < \frac{D}{\nu_D(D + E)}$$

(This has happened in both examples 1 and 2, sec. 7.)
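
For the single-missing-plot case the size and sign of the discrepancy can be tabulated directly. The sketch assumes that each expected mean square has the form $\sigma_2^2 + q\sigma_\delta^2[1 - R/(\nu_R(R + E))]$ and that M, D and E are proportional to their degrees of freedom, so that N cancels; the result is expressed in units of $q\sigma_\delta^2$. The function name and the chosen layouts are illustrative.

```python
def expectation_gap(m, r, q):
    """E(treatment m.sq.) - E(error m.sq.) in units of q*sigma_delta^2, one missing sub-plot."""
    nu_M = m - 1
    nu_D = (m - 1) * (r - 1)
    nu_E = m * (q - 1) * (r - 1)
    # R/(nu_R*(R+E)) reduces to 1/(nu_R + nu_E) when R = nu_R/N and E = nu_E/N.
    return 1 / (nu_D + nu_E) - 1 / (nu_M + nu_E)

for m, r, q in [(3, 4, 2), (4, 3, 5), (5, 6, 4)]:
    gap = expectation_gap(m, r, q)
    print(m, r, q, round(gap, 5), gap < 0, abs(gap) < 1 / (m - 1))
```

For more than two replications the gap is negative and well inside the stated bound; with exactly two replications the two degrees of freedom coincide and the gap vanishes.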

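Secondly, the sums of squares are distributed as the weighted sum
of two independent $\chi^2$, namely as

$$\chi^2_{\nu_R-1}\,\sigma_1^2 + \chi^2_1\left(\sigma_1^2 - \frac{Rq\sigma_\delta^2}{R+E}\right) \tag{14}$$

The discrepancy from homogeneous $\chi^2_{\nu_R}\sigma_1^2$ may not be serious if $\sigma_\delta^2$ is
small or if E is large relative to M and D.

Thirdly the $\chi^2_1$ terms in the two sums are correlated owing to both
containing $b_E$. Their correlation coefficient is equal to the square of
the correlation between $(b_M - b_E)$ and $(b_D - b_E)$ and is:

$$\frac{MD}{(M + \omega E)(D + \omega E)} \tag{15}$$

The overall correlation of the two mean squares is:

$$\frac{MD}{\sqrt{\left[(M+E)^2(\nu_M - 1)\omega^2 + (M + \omega E)^2\right]\left[(D+E)^2(\nu_D - 1)\omega^2 + (D + \omega E)^2\right]}} \tag{16}$$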

All three disturbances will be small if E is large relative to M and
D; and having further regard to the condition that they operate only
on the square associated with a single degree of freedom in each row
their overall disturbance of the F test is likely to be trivial, possibly
no worse than the effect of departures from normality in the distribution
of observations. Nevertheless most workers may feel that the foundation
for the procedure is unsatisfactory and may prefer the method
of section 4.
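
How quickly the correlations fall away as E grows can be seen from a small calculation. The functions below follow the forms of (15) and (16) as read in this edition, derived on the assumption that $b_M$, $b_D$ and $b_E$ are independent with variances $\sigma_1^2/M$, $\sigma_1^2/D$, $\sigma_2^2/E$ and $\omega = \sigma_1^2/\sigma_2^2$; the numerical inputs are hypothetical.

```python
import math

def chi_term_correlation(M, D, E, omega):
    """Correlation of the two single-d.f. chi-square terms, as in (15)."""
    return M * D / ((M + omega * E) * (D + omega * E))

def mean_square_correlation(M, D, E, omega, nu_M, nu_D):
    """Overall correlation of the two reduced mean squares, as in (16)."""
    a = (M + E) ** 2 * (nu_M - 1) * omega**2 + (M + omega * E) ** 2
    b = (D + E) ** 2 * (nu_D - 1) * omega**2 + (D + omega * E) ** 2
    return M * D / math.sqrt(a * b)

M, D, omega, nu_M, nu_D = 3.0, 6.0, 2.0, 3, 6      # hypothetical values
for E in (5.0, 20.0, 80.0):
    print(E,
          round(chi_term_correlation(M, D, E, omega), 5),
          round(mean_square_correlation(M, D, E, omega, nu_M, nu_D), 5))
```

Both quantities shrink as E increases, and the overall correlation is always the smaller of the two because the remaining degrees of freedom in each row are uncorrelated.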


4. BARTLETT'S METHOD 


Except for desire to gain additional accuracy one would usually have 
no hesitation in analyzing the results of an experiment without reference 
to a concomitant variable. Indeed one regularly does so either because 
one has not thought the effect of a variable x to have been sufficiently 
great to be worth evaluating or because observations of it have not been 
recorded. In other words the effects of x are lumped with experimental 
error with the assumption that they have been randomized with treat- 
ments as for all other components of plot error. Alternatively, knowing 
x, one may adjust on theoretical considerations, on a priori experience, 
or simply on a guess. An example of the last is to analyze the differences 
of some character after and before the application of treatments, 
equivalent to assuming b = 1 for the regression of final on preliminary 
yields. 

Suppose the true model to be as in equation (1) and that we adjust
yields on the basis of an arbitrary regression coefficient $b_0$. (No adjustment
is included by putting $b_0 = 0$.) The analysis of variance
may be done directly on adjusted yields, $z_{ijk} = y_{ijk} - b_0x_{ijk}$; or be
derived from covariance analysis on the original data as in table 1,
adjusting each row by

$$R_{zz} = R_{yy} - 2b_0R_{xy} + b_0^2R_{xx}$$

Now $b_E$, being independent of main-plot rows of the analysis, can be
introduced into them as $b_0$ in the above argument. Valid tests follow
provided that $x_{ijk}$ have been associated at random with treatments.
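
A minimal sketch of the main-plot test under this procedure is given below: the sub-plot error coefficient is computed from row E and then used to adjust rows M and D, each keeping its full degrees of freedom. The dictionary layout, the function name and every number are hypothetical.

```python
def bartlett_main_plot_F(M, D, E, nu_M, nu_D):
    """Main-plot F ratio after adjusting rows M and D by the sub-plot coefficient b_E.

    Each of M, D, E is a dict with keys 'yy', 'xy', 'xx' (sums of squares and products).
    """
    b_E = E["xy"] / E["xx"]

    def adjust(R):
        # Row adjusted by a coefficient taken from outside the row.
        return R["yy"] - 2 * b_E * R["xy"] + b_E**2 * R["xx"]

    M_zz, D_zz = adjust(M), adjust(D)
    return b_E, (M_zz / nu_M) / (D_zz / nu_D)

# Hypothetical covariance table for m = 4 main treatments in r = 3 blocks.
M = {"yy": 120.0, "xy": 30.0, "xx": 15.0}
D = {"yy": 60.0, "xy": 20.0, "xx": 25.0}
E = {"yy": 80.0, "xy": 36.0, "xx": 40.0}

b_E, F = bartlett_main_plot_F(M, D, E, nu_M=3, nu_D=6)
print(round(b_E, 3), round(F, 3))
```

Because the coefficient comes from a different stratum, no degree of freedom is deducted from the main-plot error, in contrast to the usual reduced sum of squares.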

For those who consider it improper to treat x as a random variable
(a point of view for which we are indebted to a referee) we can regard
$(b_0 - \beta)x_{ijk}$ as fixed arbitrary deviations which become a part of the
deviations of $z_{ijk}$ about which no postulate need be made except that
they are randomized with treatments. The potential distribution of
individual mean squares will not then follow a continuous standard
form, but randomization in many experiments (with different sets of x)
will generate a distribution of their ratios which is approximately that
of F (Pearson, 1937, and associated papers).

However, in the same way that it is convenient to postulate y as a
normally distributed random variable, whether accepting this as a basic
postulate or as a device to simplify the argument on the grounds that
it leads to the same approximation and has been empirically justified
by experience, one may postulate as part of the model that x is also
a normally distributed random variable with structure

$$x_{ijk} = \cdots + d_{ij} + e_{ijk}$$

where $d_{ij}$ are NID$(0, \sigma_d^2)$, $e_{ijk}$ are NID$(0, \sigma_e^2)$, and

$$\mathcal{E}(M)/(m - 1) = \mathcal{E}(D)/\nu_D = \sigma_e^2 + q\sigma_d^2 = \sigma_{1x}^2$$
$$\mathcal{E}(S_{xx})/(q - 1) = \mathcal{E}(SM_{xx})/(q - 1)(m - 1) = \mathcal{E}(E)/\nu_E = \sigma_e^2 = \sigma_{2x}^2$$

The expectations of adjusted y mean squares are then as usual except
that error components $\sigma_i^2$ have to be replaced by $\sigma_i^2 + (b_0 - \beta)^2\sigma_{ix}^2$;
and on the null hypothesis sums of squares are independently distributed
as $\chi^2(\sigma_i^2 + (b_0 - \beta)^2\sigma_{ix}^2)$.

If $b_0$ be replaced by $b_E$ the argument applies only to the main plot
analysis. To test sub-plot effects we must fall back on the usual reduced
sum of squares. On observing that the main plot analysis might be so
treated we felt that some question might be raised about the propriety
of treating x as a random variable, or as a set of randomly allotted
deviations, while still applying to sub-plots a regression analysis which
usually regards x as given constants. We therefore asked: What then
is the appropriate test for sub-treatments when assuming x to be a
random variable? This may be answered along well known lines by
saying that if we demarcate critical regions with probability $\alpha$ for each
conditional distribution given a set of x, the composite critical region
obtained by integrating these over all sets of x will still have probability
$\alpha$ for repetitions of the experiment.* However Sir Ronald Fisher
(personal communication) would say that, in formulating a test of significance,
to ask for the frequency of occurrence of an event in repeated
samples would be the wrong question. On re-reading the paper by
Barnard (1950) we feel obliged to agree.


*Under the null hypothesis the conditional distributions for given x are independent of x and the
result for control of type I error follows immediately. But under the non-null hypothesis the means of
linear functions, whose squares form partitions of the respective sums of squares, are functions of the
treatment constants and of means of x. Integration to obtain the marginal distributions and thence the
average power of such tests seems impracticable. However while investigating this approach (before
communicating with Sir Ronald Fisher and reading Barnard, 1950) we noted: (1) That in any given
experiment the set of x may be regarded as ancillary statistics in the sense of Fisher indicating the
accuracy of estimates obtainable from that particular experiment (as contrasted with the average
accuracy over experiments in general). (2) That the theoretical average power over many experiments
for some postulated distribution of x could be of interest in planning experiments but would be of little
practical use. What it might tell about potential accuracy and thence about the size of experiment
required to achieve a stated power would usually be trivial relative to uncertainty about error variances
and regressions which would in fact be found. The distribution of x would usually be neither prescribable
nor at our disposal. If the latter, we would plan to equalize all treatment means of x so that treatment
sums of squares for x would vanish, the distribution of treatment contrasts would be again independent
of x, and the non-central parameter in the distribution of treatment sum of squares would have its
maximum value, proportional to $\sum\alpha_i^2$, since no adjustments would be required.

Barnard (1950) reached similar conclusions: "Probabilities are relevant before an experiment has
been performed, when we are planning it. After the experiment has been performed, when we are
drawing conclusions, likelihoods are relevant. As a theory based on probabilities, the Neyman-Pearson
theory is useful in planning, before the result is known; but after the result is known, the theory of
likelihood should be used."

Fisher’s viewpoint seems to have been first expressed in his paper of 
1922, where he discusses K. Pearson’s consideration of the effect on 
regressions of sampling fluctuation of the numbers of observations in 
each array of a correlation table. Fisher wrote: ‘The difference in 
principle is of some importance, since the simplicity of many of the 
results here obtained is a consequence of the fact that we have not 
attempted to eliminate known quantities, given by the sample, from the 
distribution formulae of the statistics studied, but only the unknown 
quantities— parameters of the population from which the sample was 
drawn—which have to be estimated somewhat inexactly from the given 
sample...”’. “This mixed distribution [for means of arrays of different 
sizes] need not concern us, however, for in applying tests of fitness we 
do not in practice ignore the size of the array.” 

Barnard concludes: “. . . the arbitrary nature of the reference set 
involved, on the Neyman-Pearson theory, in a test of significance, is a
decisive reason for rejecting that theory, as a theory of inference, in 
favour of using a theory of inference, such as that given by Fisher, 
where the idea of a reference set does not enter.” 

We conclude therefore that justification for Bartlett’s procedure 
rests best on fiducial inference. Since this requires no postulate for the 
reference set of sets of x which might appear on repetitions of the 
experiment, alternative assumptions about x make no difference to the 
appropriate test for sub-treatments. 


5. APPLICATION TO MISSING SUB-PLOTS 


The argument of section 4 fails to carry over to analysis of main- 
plots when estimates (regression coefficients on dummy variables) are 
inserted for missing sub-plots. Even supposing that the missing sub- 
plot or plots occur at random, which will often not be true, it does not 
seem possible under any circumstances to regard the dummy variables 
as random. The sum of squares for any one of them in row R is identically
$\nu_R/N$ where $\nu_R$ is the degrees of freedom for the row and N = rmq.
The dummy variables must therefore be treated as arbitrary constants 
throughout. 


We consider in detail only a single missing sub-plot for which we
put x = $-1$. For all other plots x = 0. The estimate of the missing
yield is $b_E$. Analysis of variance, using $b_E$ as if observed, is then
equivalent to evaluating adjusted sums of squares as in sec. 4. We
can now take either of two points of view: (1) $b_E$ is a random variable
with variance $\sigma_2^2/E = \sigma_2^2N/\nu_E$; or (2) that with respect to the main-plot
analysis it is, as in sec. 4, a constant.


On viewpoint (1) a main-plot sum of squares is distributed as

$$\chi^2_{\nu_R-1}\,\sigma_1^2 + \chi^2_1\,R_{xx}\,\mathrm{var}(b_R - b_E) = \chi^2_{\nu_R-1}\,\sigma_1^2 + \chi^2_1\left(\sigma_1^2 + \frac{\nu_R}{\nu_E}\sigma_2^2\right) \tag{17}$$

On view (2) it is distributed as non-central $\chi^2$ with parameter $(b_E - \beta)^2\nu_R/N$
whose average value (over many experiments) should be $\sigma_2^2\nu_R/\nu_E$.
Although these distributions are slightly different presumably the
quasi-F ratios should have the same distribution. On either view the
expectations of mean squares on the null hypothesis are equal for all
rows.

Comparison to Anderson's method is most easily made on view (1).
The difference lies in the multiplier of $(b_R - b_E)^2$ which is now, by equation
(5), $R_{xx}$ in place of $R_{xx}E_{xx}/(R + E)_{xx}$ as in equations (9) and (10).
The consequences are: (i) The null expectations of mean squares are
equal for main treatments and error; instead of being slightly different
as in equations (11) (12). (ii) We still have weighted $\chi^2$ distributions,
the factor with $\chi^2_1$ being now slightly inflated by a factor proportional
to $\sigma_2^2$, (17), whereas formerly it was deflated by a factor proportional
to $q\sigma_\delta^2$, (14). (iii) The correlation of the $\chi^2_1$ terms remains the same as
formerly; but their multiplier being now slightly larger, namely
$\sigma_2^2(R_{xx} + \omega E_{xx})/E_{xx}$ in (17) instead of $\sigma_2^2(R_{xx} + \omega E_{xx})/(R_{xx} + E_{xx})$ as in
(14), the effect will be a shade more serious. As contrasted with (16)
the overall correlation of treatment and error mean squares now has
$E^2$ in place of $(M + E)^2$ and $(D + E)^2$ in the denominator of (16). These
can of course here be evaluated in terms of degrees of freedom but the
expression is too complex to be worth bothering about. Numerical
values are given for example 3 in section 7.

On balance we may prefer to use for the main-plot analysis the 
adjusted mean squares (sec. 4) rather than the reduced ones (sec. 3) 
on the grounds of maintaining equalized null expectations for the mean 
squares and easier computation. 

With several, say k, missing sub-plots and estimates $b_t$, $t = 1, \dots, k$,
we obtain on view (2) a non-central parameter proportional to
$\left(\sum_t (b_t - \beta_t)\right)$ and the additional complication that expectations of
treatment and error mean squares may not be identically equal since
the sums of products $M_{x_tx_u}$, $D_{x_tx_u}$ do not remain proportional to the
respective degrees of freedom unless all missing plots are in the same
replication. The disturbance may be expected to be comparatively
trivial.




34 BIOMETRICS, MARCH 1956 


6. POWER 


For $m$ treatments with effects $\alpha_i$, in $r$ replications with variance $\sigma^2$
per unit plot, the criterion for entering tables of the power of analysis
of variance tests as prepared by Tang (1938) is $\phi^2 = r \sum \alpha_i^2 / m\sigma^2$. For
a straight main plot analysis in above notation this is $rq \sum \alpha_i^2 / m\sigma_1^2 = K$,
say. After adjusting by covariance using $b_D$ the non-central parameter
of the reduced sum of squares becomes reduced so that the criterion is


M+E 


If treatments are randomized with zx so that the average correlation of 
z;.. and a; is zero, the average value of (18) is (Smith, 1955a) 


K(1 (19) 


By Bartlett’s method the non-central parameter is retained proportional 
to >. a; but error variance is inflated leading to 


K[1 + (bs — (20) 


The average value of (bz — 8)” is o2/E and the expected value of £ is 
e022 ; therefore (20) is asymptotically 


K(1 + (21) 


where w = 03/02 , #: = oiz/022. If we suppose that the ratios w and w, 
may be approximately equal (21) becomes approximately 


K(1 — 1/vz) (22) 


There is also one more degree of freedom for the estimate of error. 
Bartlett’s method may therefore be expected to have slightly more 
power on the average; but equations (18) (20) show that the comparison 
is susceptible to the z,;..a; combinations, actual error in estimating 
bz , and the distribution of z, so that any individual case may show 
appreciable shifts in either direction. 

An alternative method of comparison is to compare the average 
variance for a difference between two treatment means. Adjusting 
by bp this is (Cochran, 1940; Finney, 1946) 


20; M.. 20% 


By Bartlett’s method it is 


2 
20; 


rq 


+ (bs — > — (1 + (24) 




SPLIT-PLOT EXPERI MENTS 35 


leading to a comparison very similar to the previous one. The efficiency
relative to adjusting by $b_D$ may therefore be expressed as


1 + w,/wrg + 3)    (25)


where the last factor allows for the extra degree of freedom in the
estimate of error (Fisher, 1935, sec. 74).

Perhaps the main advantages of Bartlett's method, rather than
increased efficiency, are simplicity and agreement of sub- and main-plot
means.


7. EXAMPLES 


In the following examples we give estimates of the disturbances 
(discrepancy in expectations of treatment and error mean squares on 
the null hypothesis, and correlation of the mean squares) by substituting 
estimates of parameters as follows: 


8; = reduced D mean square = D*/(vp — 1) 

8; = reduced E mean square = E*/(vg — 1) 
=si—8, = veD/vE. 
s4 = est + (be — = D*/vp 


where D* has been adjusted by bg . Using this estimate of si an 
empirical estimate of the relative efficiency of Bartlett’s method, 
corresponding to the ratio of the first parts of (23): (24) multiplied by 
Fisher’s factor for the information from the extra degree of freedom, is: 


20) 


This represents the ratio of variances of treatment contrasts as they 
would actually be computed in a particular case. However both s}/s 
and M/D may be rather erratic: for example, in example 1, si/s4 is 
greater than the theoretical maximum for o}/o4 < 1 owing to (bp — bz)’ 
happening to be less than s;/D; and, in both examples 1 and 2, M/D is 
less than the expected ratio 1/(r — 1). Therefore (25), with @ and a, 
estimated from the data, may be a more stable criterion to represent 
average efficiency for a given type of experiment. 

The relative discrepancy of expected mean squares (on the null 
hypothesis) in Anderson’s method is measured as equation (11)— 
equation (12) (substituting above estimate for qgo3) divided by the error 
mean square as evaluated for the same method. 

Example 1. Bartlett (1937) illustrates on yields of a cotton experiment
with eye estimates of salt accumulation in the soil as concomitant
variable. The relevant parts of the analysis of covariance are:


                              S. Sq. and Prod.           Reduced
Source             d.f.      y²         xy        x²      M.Sq.       b      var(b)

Main treatments      5     32.210    -29.068     31.20    1.561
Error D             10     97.807    -78.005    128.49    5.606    -.6071    .04363
Error E             84    240.754   -191.835    279.33    1.313    -.6868    .00470

$\hat\omega$ = 4.268    $\hat\omega_x$ = 3.864


The main- and sub-plot regressions are plainly similar and the sub-plot 
coefficient appears nine times more accurate than that for main plots. 
Comparison of results of the different methods is as follows: 


                                                Reduced by
                                        d.f.       b_D        Bartlett      Anderson

Main treatment mean square                5       1.561        1.405         1.368
Main-plot error variance (d.f.)                (9) 5.606   (10) 5.127    (10) 5.101
F                                                  .278         .274          .268
Relative efficiency (26)                                        1.16
Relative efficiency (25)                                        1.088
Relative discrepancy of mean squares                                          0.96%
Correlation of χ²'s
Correlation of mean squares                                                   .00025
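
The regression coefficients, their variances and the reduced mean squares in the covariance table of Example 1 can be checked from the sums of squares and products alone. The sketch below is only an illustration: the helper name `error_row` is hypothetical, and it assumes the usual formulas b = Sxy/Sxx, reduced mean square = (Syy - Sxy²/Sxx)/(d.f. - 1) and var(b) = reduced mean square/Sxx.

```python
def error_row(syy, sxy, sxx, df):
    """Return (b, reduced mean square, var(b)) for one error row."""
    b = sxy / sxx                       # regression of y on x within the row
    reduced_ss = syy - sxy ** 2 / sxx   # sum of squares after fitting b
    reduced_ms = reduced_ss / (df - 1)  # one d.f. is spent on the regression
    return b, reduced_ms, reduced_ms / sxx

# Sums of squares and products quoted in the table for Example 1.
b_D, ms_D, var_D = error_row(97.807, -78.005, 128.49, 10)
b_E, ms_E, var_E = error_row(240.754, -191.835, 279.33, 84)

# Ratio of the x mean squares in the two error rows (the second
# estimate printed under the table).
omega_x = (128.49 / 10) / (279.33 / 84)

print(f"b_D = {b_D:.4f}, reduced M.Sq. = {ms_D:.3f}, var = {var_D:.5f}")
print(f"b_E = {b_E:.4f}, reduced M.Sq. = {ms_E:.3f}, var = {var_E:.5f}")
print(f"variance ratio = {var_D / var_E:.1f}, omega_x = {omega_x:.3f}")
```

The variance ratio of about nine is the basis of the remark that the sub-plot coefficient appears nine times more accurate than that for main plots.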


Example 2. The following table gives the relevant parts of analysis of
covariance for a fertilizer experiment on orange trees at Riverside,
California. It was observed for 12 successive years and years are
treated as the sub-treatments. The concomitant observations are
yields of paired check plots (pounds per tree per annum).


                              S. Sq. and Prod.           Reduced
Source             d.f.      y²         xy        x²      M.Sq.       b      var(b)

Main treatments      3     10,433      2,469     6,591    3,181
Error D              9      6,916      9,729    32,902    504.9     .2957    .01535
Error E            132     64,044     37,419    75,595    347.5     .4950    .00460

$\hat\omega$ = 1.403    $\hat\omega_x$ = 6.384

The difference in error regression coefficients, .199 ± .141, is non-
significant and the accuracy of $b_E$ appears more than three times that
of $b_D$. The tests for significance of fertilizer treatments appear as
follows:


                                                Reduced by
                                        d.f.       b_D        Bartlett      Anderson

Main treatment mean square                3       3181         3201          3199
Main-plot error variance (d.f.)                (8) 504.9    (9) 594.0     (9) 550.0
F                                                  6.30         5.39          5.82
Relative efficiency (26)                                        .949
Relative efficiency (25)                                        1.095
Relative discrepancy of mean squares                                          0.20%
Correlation of χ²'s                                                           .013
Correlation of mean squares                                                   .0023


Example 3. Anderson's (1946) example of a missing sub-plot gives
relevant parts of the analysis of covariance as follows. In this method
of analysis we can insert any arbitrary value for the missing plot, say
$y_0$, and then the regression coefficient $b$ estimates the difference between
$y_0$ and the value which would be estimated in the usual way. To save
recomputation we have imagined $y_0$ 'guessed' as Anderson's estimated
value, 763, so that sums of squares for $y$ can be taken from his table:
this of course leads to $b_E$ less than a half.


                                                          Reduced
Source             d.f.       y²          xy        x²     M.Sq.        b       var(b)

Main treatments      3     1031784    -118.7344    3/64
Error D              9       80644     -41.3881    9/64   8562.27    -293.89     60887
Error E             36      151950      -0.1875   36/64   4341.43      -0.33      7718


The estimates of the missing plot from sub-plot and main-plot error
rows, namely 763 and 469, appear superficially different; but the difference,
294, only slightly exceeds its standard error, $(68605)^{1/2} = 262$. The
sub-plot estimate appears nearly eight times as accurate as the main-
plot estimate. The alternative tests of significance for main-treatments
work out as follows:


                                                Reduced by
                                        d.f.       b_D        Bartlett      Anderson

Main treatment mean square                3      302396       343902        336192
Main-plot error variance (d.f.)                 (8) 8562     (9) 8957      (9) 8687
F                                                 35.32        38.39         38.70
Relative discrepancy of mean squares                            0           -0.17%
Correlation of χ²'s                                            .0046         .0046
Correlation of mean squares                                    .00083        .00064
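
The comparison of the two missing-plot estimates in Example 3 is a short piece of arithmetic, sketched here from the figures quoted in the text. It assumes the two regression coefficients are independent, so that their variances add; the variable names are illustrative only.

```python
import math

var_b_D, var_b_E = 60887.0, 7718.0   # var(b) for the D and E error rows
est_sub_plot, est_main_plot = 763.0, 469.0

difference = est_sub_plot - est_main_plot
std_error = math.sqrt(var_b_D + var_b_E)   # variance of a difference of independent estimates
accuracy_ratio = var_b_D / var_b_E         # relative accuracy of the sub-plot estimate

print(difference, round(std_error), round(accuracy_ratio, 1))
```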

SUMMARY 


The paper examines some methods which have been proposed for 
covariance adjustments in split-plot experiments when it may be 
assumed that the regressions appropriate to sub-plot and main-plot 
comparisons are equal. Theoretically the most efficient analysis 
should be given by estimating the regression coefficient by a weighted 
mean of those indicated by the main-plot error and sub-plot error rows 
of the analysis of covariance. But this would usually be too troublesome. 

Frequently the sub-plot error estimate is considerably the more 
accurate of the two and one may wish to use it for adjustment of main-, 
as well as of sub-treatments. When this is done two methods of testing 
the significance of main-treatments have been proposed. One was 
tentatively proposed by Anderson (1946) for analyses with missing 
sub-plots, but has been used by other workers for adjustment on an 
observed concomitant variable. There are theoretical objections: the
expectations of treatment and error mean squares are not identically
equal when the null hypothesis is true, the reduced sums of squares
are distributed as the weighted sum of two $\chi^2$'s instead of as simple
$\chi^2$'s, and they are correlated. The disturbances to the theoretical
conditions for an exact F test are however so small as to be trivial in
practice.

Bartlett (1937) proposed to adjust all main-plot sums of squares
using the sub-plot regression as an arbitrary correction independently
evaluated. This is shown to lead to valid tests of significance provided
the treatments have been randomized with respect to the concomitant
variable. It is not theoretically valid for missing sub-plots. It does
maintain equal expectations for the null mean squares; but, in this
application, still has the other two defects of the Anderson method,
again trivial.

The gain in efficiency from using the sub-plot regression instead of 
the main-plot regression is about 9 per cent in two examples examined. 
Perhaps the chief advantage lies in maintaining consistency between 
adjusted sub-plot and main-plot means, thus simplifying presentation 
of results. When this is to be done the Bartlett procedure is recom- 
mended, being computationally simpler and having better theoretical 
validity. 

Although both split-plot experiments and experiments with co-
variance are very common, we found it surprisingly difficult to find
examples of split-plots with covariance. Out of nine found, six showed
significant differences between the main-plot and sub-plot regression
coefficients, suggesting that to assume equality may not be as generally
valid as one would expect.


REFERENCES 


Anderson, R. L. (1946). Missing-plot techniques. Biometrics Bull., 2, 41-47. 

Barnard, G. A. (1950). On the Fisher-Behrens test. Biometrika, 37, 203-207. 

Bartlett, M. S. (1937). Some examples of statistical methods of research in agricul- 
ture and applied biology. J. Roy. Stat. Soc. Suppl., 4, 137-170. 

Cochran, W. G. (1940). Analysis of lattice and triple lattice experiments. II.
Mathematical theory. Iowa Agr. Expt. Sta. Res. Bull., 281, 64-65.

Cochran, W. G. (1946). Analysis of covariance. Univ. North Carolina, Inst. Stat. 
Mimeo. Series No. 6. 

Finney, D. J. (1946). Standard errors of yields adjusted for regression on an inde- 
pendent measurement. Biometrics Bull., 2, 53-55. 

Fisher, R. A. (1922). The goodness of fit of regression formulae and the distribution 
of regression coefficients. J. Roy. Stat. Soc., 85, 597-612. 

Fisher, R. A. (1935). The design of experiments. Oliver and Boyd: Edinburgh. 

Pearson, E. S. (1937). Some aspects of the problem of randomization. Biometrika, 
29, 53-64. 

Smith, H. F. (1955a). Tests of significance in analysis of covariance and some related 
regression techniques. 

Smith, H. F. (1955b). Variance components, finite populations and experimental 
inference. Univ. North Carolina, Inst. Stat. Mimeo Series. No. 135. 

Tang, P. C. (1938). The power function of the analysis of variance tests. Stat. Res. 
Mem., 2, 126-149. 




A NOTE ON THE COMBINATION OF ESTIMATES OF 
RELATIVE POTENCY IN MULTIPLE ASSAYS* 


PaMELA M. CLARKE 
National Institute for Research in Dairying, Shinfield, England 


A problem of frequent occurrence in the field of biological assay 
is that of combining several estimates of the potency of a substance. 
Such estimates may be obtained, for example, on different occasions, 
or in different laboratories, or even by different methods, and generally 
they will have different variances, so that some form of weighted mean 
is usually required. Bliss (1952) and Finney (1952) have discussed 
methods of calculating suitable weighted means and their fiducial limits 
in various circumstances, and Bennett (1954) has given further ex- 
tensions of the theory. 

All these methods, however, are appropriate only when the individual 
estimates of potency are independent, a condition which is not fulfilled 
in certain cases of practical interest. For example, in an experiment 
planned to examine the sampling variability of the vitamin content of 
milk, several samples of milk were taken and assayed against the same 
standard preparation in a multiple assay. A combined estimate of the 
vitamin content of the milk was then required, with estimates of fiducial 
limits based on the variation from sample to sample, which happened to 
be appreciable. A similar situation might arise if samples of grass 
from different parts of a field were tested, again in a multiple assay, 
for oestrogen content, and if an overall estimate of the oestrogen 
content were also required. In such cases, multiple assay designs are 
economical in time and material, but since the sample estimates of 
relative potency are obtained by reference to the same set of results 
for the standard, they are not independent, so that new methods for 
obtaining the limits of error are required. 

The method will first be illustrated by a practical example, and 
this will be followed by an outline of the derivation of the formula. 


Numerical example 


The data for the example are taken from the results of a variability 
study, made in collaboration with Dr. M. E. Gregory, on the micro- 
biological assay of riboflavin in samples from the same bulk of cows’ 
milk, using Lactobacillus casei as test organism. Estimates of the 


*N. I. R. D. paper No. 1728. 




MULTIPLE ASSAYS 41 


potency of the individual samples were required in order to examine 
the variation between them, and this variation was to be taken into 
account in assessing the limits of error for the combined estimate of 
potency obtained from different samples. The complete experiment
occupied 5 days, but one day's results are sufficient here.

Five samples of milk were assayed against a standard riboflavin 
solution at 4 dose-levels—1, 2,3 and 4 ml. There were two replications, 
arranged in randomized blocks in the two halves of a wire basket, and 
each operation was carried out first on one block and then on the other, 
in the same order. Twice as many tubes were set up for the standard 
as for each milk sample. 

The individual observations are set out in Table 1, and Table 2
shows the results of the analysis of variance. As expected, the linear
regression on log dose was highly significant, and the usual validity
tests for a parallel-line assay were satisfied, since the mean squares
for deviations from a linear regression and for the interactions (doses
× standard v. milk) and (doses × milk samples) were not significant
(P > 0.05). In fact the latter mean square approached significance at
the 5% level, but the combined evidence of the 5 days' results supported
the assumption of the validity of the model.


TABLE 1
ASSAY OF VITAMIN B₂ POTENCY OF 5 SAMPLES OF COWS' MILK
Individual results: 10³ × log (optical density reading)


                Standard                          Milk samples
Dose     Duplicate tubes    Total       1       2       3       4       5     Total
(ml.)

                                      Block 1

  1        505     531       1036      505     477     544     491     525     2542
  2        690     690       1380      706     602     740     648     699     3395
  3        806     806       1612      833     732     833     756     813     3967
  4        851     863       1714      886     833     892     826     863     4300

                                      Block 2

  1        525     531       1056      544     462     597     484     519     2606
  2        699     708       1407      716     628     740     633     690     3407
  3        792     799       1591      820     732     833     732     799     3916
  4        857     857       1714      875     806     886     820     869     4256

                              Total over both blocks

  1                          2092     1049     939    1141     975    1044     5148
  2                          2787     1422    1230    1480    1281    1389     6802
  3                          3203     1653    1464    1666    1488    1612     7883
  4                          3428     1761    1639    1778    1646    1732     8556

Total                       11510     5885    5272    6065    5390    5777    28389


TABLE 2
ASSAY OF VITAMIN B₂ POTENCY OF 5 SAMPLES OF COWS' MILK
Mean squares in the analysis of variance


Source of variation                  Degrees of freedom     Mean square

Blocks                                        1                     1
Preparations
  Standard v. milk                            1                 1,064
  Between milk samples                        4                14,069 = $s_2^2$
Doses
  Linear regression                           1               921,069
  Deviations                                  2                   166
Doses × preparations
  Doses × (standard v. milk)                  3                   120
  Doses × milk samples                       12                   263
Blocks × (standard v. milk)                   1                    50
Blocks × linear regression                    1                   927
Residual                                     29                   128 = $s_1^2$

The highly significant (P < 0.001) mean square for differences
between milk samples reflects the wide variation between the estimates
of potency for the different samples, shown in Table 3. Further experi-
mentation was carried out to investigate the reasons for this variation,
but meanwhile a combined estimate of the relative potency of the milk
and limits of error were required. Clearly the between-sample variation
cannot be ignored in such a case, and it would be incorrect to follow
the procedures outlined by Bliss (1952) or Finney (1952) since the five
estimates of relative potency are not independent. Finney's methods
are in any case intended for use only if the sample estimates of relative
potency are homogeneous.


TABLE 3
ASSAY OF VITAMIN B₂ POTENCY OF 5 SAMPLES OF COWS' MILK
Sample estimates of relative potency


Sample      Relative      5% fiducial
            potency         limits

  1          1.07         1.03, 1.12
  2          0.78         0.75, 0.82
  3          1.17         1.12, 1.22
  4          0.83         0.80, 0.87
  5          1.01         0.97, 1.06

In the general case, let $v$ be the number of samples of the test
preparation, $n$ be the number of replications for each test sample and
$n(p + 1)$ be the number of replications for the standard preparation.
Let $k$ denote the number of dose-levels for each material, with log
doses denoted by $x_1, x_2, \dots, x_k$. Finally, let $Y_i$ be the total response,
over both preparations and all samples, at log dose $x_i$, let $Y_T$ be the
total response to all doses of all test samples and let $Y_S$ be the total
response to all doses of the standard.

In this example, therefore, we have $v = 5$, $n = 2$, $p = 1$, $k = 4$;
$Y_1 = 7240$, $Y_2 = 9589$, $Y_3 = 11086$, $Y_4 = 11984$, $Y_T = 28389$ and
$Y_S = 11510$.

We first calculate $[X]$ as $k \sum x_i^2 - (\sum x_i)^2$, giving $[X] = 0.818$.

Then the mean slope $b$ of the log dose/response line is given by

$$b = \{k \sum x_i Y_i - \sum x_i \sum Y_i\} / n(p + v + 1)[X] = 567.$$

Since $M$, the log relative potency, is given by

$$M = \{(p + 1) Y_T - v Y_S\} / nvkb(p + 1),$$

we have $M = -0.017$.
The next step in the calculations is to evaluate $g$, which is given by

$$g = k t^2 s_1^2 / n(p + v + 1)[X] b^2,$$

where $s_1^2$ is the mean square shown in Table 2. The value of $t$ is obtained
from standard tables at the required probability level, but a slight
difficulty here is in deciding the appropriate number of degrees of
freedom for entering the tables. This number lies between $f_2$ and
$(f_1 + f_2)$, where $f_1$ and $f_2$ are the number of degrees of freedom for
$s_1^2$ and $s_2^2$ respectively (see Table 2). Basing $t$ for the moment on $f_2$,
i.e. 4, degrees of freedom, $g$ is found to be only 0.001, and since we
have taken the largest possible value of $t$, $g$ may safely be taken as zero
in the remaining calculations.

It is now possible to calculate the fiducial limits of the relative
potency estimate. Since $g$ is small, $s_M^2$, the variance of $M$, is approxi-
mately equal to $(\rho + 1) s_2^2 / nvkb^2$, where

$$\rho = v s_1^2 / (p + 1) s_2^2.$$

The numerical example gives $\rho = 0.0228$, so that the variance of $M$
is approximately equal to 0.001119.




An estimate of the effective number of degrees of freedom, $f$, to
ascribe to $s_M^2$, may be found quite simply from an approximate formula
given by Cochran (1951):

$$f = \frac{(\rho + 1)^2}{\rho^2 / f_1 + 1 / f_2}.$$

Since in our example we have $f_1 = 29$ and $f_2 = 4$, this formula gives
$f = 4.2$, and interpolation in standard tables gives a value of 2.7 for
$t$. It may be noted that the value of $f$ is close to $f_2$ when the ratio of
$s_2^2$ to $s_1^2$ is high, as in this case; for a value of $s_2^2$ nearer $s_1^2$, i.e. when the
sampling variation is less important, $f$ is nearer $f_1 + f_2$.

From the formula $M_L, M_U = M \pm t s_M$, the logarithms of the 5%
fiducial limits of the relative potency estimate are thus found to be
approximately −0.107 and 0.073, and the combined estimate of relative
potency over all samples is therefore 0.96, with approximate 5% fiducial
limits of 0.78 and 1.18. It is worth noting that had the between-sample
variation not been taken into account, the fiducial limits would have
been calculated to be 0.94 and 0.99, plainly under-estimates.
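
The whole chain of calculations in the numerical example can be retraced in a few lines. The sketch below is an illustration under stated assumptions rather than the author's own computing scheme: it takes logarithms to base 10, uses the totals of Table 1 and the mean squares of Table 2, reads the formulas for b, M, g, the variance ratio (called `rho` here) and f as they are described in the text, and takes the two values of t from standard tables.

```python
import math

# Quantities quoted in the text and in Tables 1 and 2.
v, n, p, k = 5, 2, 1, 4
doses_ml = [1, 2, 3, 4]
Y_dose = [7240, 9589, 11086, 11984]   # totals over all preparations at each dose
Y_T, Y_S = 28389, 11510               # totals for test samples and for the standard
s1_sq, s2_sq = 128.0, 14069.0         # residual and between-sample mean squares
f1, f2 = 29, 4

x = [math.log10(d) for d in doses_ml]
X = k * sum(xi ** 2 for xi in x) - sum(x) ** 2
b = (k * sum(xi * yi for xi, yi in zip(x, Y_dose))
     - sum(x) * sum(Y_dose)) / (n * (p + v + 1) * X)
M = ((p + 1) * Y_T - v * Y_S) / (n * v * k * b * (p + 1))

t_4df = 2.776                         # tabulated two-sided 5% point of t on 4 d.f.
g = k * t_4df ** 2 * s1_sq / (n * (p + v + 1) * X * b ** 2)

rho = v * s1_sq / ((p + 1) * s2_sq)
var_M = (rho + 1) * s2_sq / (n * v * k * b ** 2)   # valid when g is negligible
f = (rho + 1) ** 2 / (rho ** 2 / f1 + 1 / f2)      # effective degrees of freedom

t = 2.7                               # value read from tables for f of about 4.2
half_width = t * math.sqrt(var_M)
log_limits = (M - half_width, M + half_width)
potency = 10 ** M
limits = tuple(10 ** z for z in log_limits)

print(f"[X] = {X:.3f}, b = {b:.0f}, M = {M:.3f}, g = {g:.3f}")
print(f"rho = {rho:.4f}, var(M) = {var_M:.6f}, f = {f:.1f}")
print(f"potency = {potency:.2f}, limits = {limits[0]:.2f}, {limits[1]:.2f}")
```

Because g is of the order of 0.001, dropping it changes none of the printed figures at the precision used in the example.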

In the full analysis of the experimental results, this procedure was
followed to obtain estimates of $M$ and $s_M^2$ for each of the 5 days, and
these values were then used to test the homogeneity of the separate 
day estimates of relative potency. The variation from day to day 
proved to be greater than would be expected simply from a consideration 
of the variation between samples within days. Since, however, the 
relative potency estimates for different days were independent, the 
method given by Bliss (1949) could be used to obtain a weighted mean 
log potency value, and a standard error allowing for variation between 
days. 

The particular example given here, while demonstrating clearly 
the need to consider the variation between samples in work of this kind, 
may perhaps obscure the practical utility of the method presented in 
this paper for the combination of estimates of relative potency, because 
of the high ratio of between-sample to within-sample variance. As with 
the method for independent estimates of potency, given by Bliss, in such 
a case the estimates of the mean log potency and its variance are almost 
the same as would be obtained by direct calculation from the separate 
sample estimates of relative potency under the assumption of independ- 
ence. It is in the less extreme, and frequently occurring, cases that the 
method is most valuable, and it can, of course, be used over the whole 
range of values of the ratio of the two variance components, without 
any further assumptions as to the safety of using more approximate 
methods. 




Derivation of formulae 


The formulae to be obtained in this section are more general than 
those quoted in the example, since it will not be assumed that g is 
negligible. The small modification of the method required when the 
replicates are not arranged in blocks will also be described. 

Under the assumptions for a parallel-line assay, the response at
log dose $x_i$ of the standard in block $j$ may be expressed as

$$y_{Sij} = \alpha_j + \beta_j x_i + \epsilon \qquad (i = 1, \dots, k;\ j = 1, \dots, n),$$

where $\alpha_j$ and $\beta_j$ are constant for all observations in a block, and $\epsilon$ is
randomly and normally distributed with mean zero and variance $\sigma^2$.

Similarly, for the response at log dose $x_i$ of the $r$th sample of the
test preparation in block $j$ we can take

$$y_{Tijr} = \alpha_j + \beta_j(\mu + x_i) + \epsilon + \epsilon' \qquad (r = 1, \dots, v),$$

where $\mu$ is the true log relative potency of the test preparation and
$\epsilon'$, which is constant over any one sample, is normally distributed with
mean zero and variance $\sigma'^2$.

The values of $\beta_j$ are assumed to be normally distributed about a
mean $\beta$.

The expressions for $b$ and $M$ have already been given in the example.
To find the fiducial limits of $M$, Fieller's theorem (1940) may be applied,
leading to the following equation:

$$M_L, M_U = \frac{1}{1 - g}\left[ M \pm \frac{t}{b} \left\{ \frac{(1 - g)\,nk[X](p + 1)(p + v + 1)\sigma'^2 + \{(1 - g)(p + v + 1)^2[X] + vk^2(p + 1)M^2\}\sigma^2}{nvk(p + 1)(p + v + 1)[X]} \right\}^{1/2} \right],$$

where $g$ is as defined in the numerical example, but with $s_1^2$ replaced
by $\sigma^2$. In the same way as for the individual sample estimates of
relative potency, the component of variance for the $\beta_j$ does not occur
in the expression for the fiducial limits.

In practice, $\sigma^2$ and $\sigma'^2$ are usually estimated from the observations
themselves. If the design involves randomized blocks, the expected
mean square for the differences between samples is $(\sigma^2 + nk\sigma'^2)$, with
$(v - 1)$ degrees of freedom. An estimate of $\sigma^2$, with

$$(n - 1)(vk + k - 3) + npk$$

degrees of freedom, is obtained by combining the mean squares for
differences between duplicates within blocks and all interactions with
blocks except the interactions (blocks × standard v. test preparations)
and (blocks × linear regression). If replications are not arranged in
blocks, an estimate of $(\sigma^2 + nk\sigma'^2)$ is obtained in the same way as for
designs with blocks, and an estimate of $\sigma^2$, with

$$k\{(n - 1)(v + 1) + np\}$$

degrees of freedom, is given by the mean square for differences within
each combination of preparation and dose-level.

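Let $s_1^2$ and $s_2^2$ denote the mean squares estimating $\sigma^2$ and $(\sigma^2 + nk\sigma'^2)$
respectively, and let $f_1$ and $f_2$ denote the corresponding degrees of
freedom. Then the equation for the fiducial limits may be written

$$M_L, M_U = \frac{1}{1 - g}\left[ M \pm \frac{t}{b} \left\{ \frac{(1 - g)\{(p + 1)s_2^2 + v s_1^2\}}{nvk(p + 1)} + \frac{k s_1^2 M^2}{n(p + v + 1)[X]} \right\}^{1/2} \right].$$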

The effective number of degrees of freedom, $f$, may be determined
by Welch's method (1947) as

$$f = \frac{(\rho + 1)^2}{\dfrac{\rho^2}{f_1 + 2} + \dfrac{1}{f_2 + 2}} - 2,$$

with $\rho = v s_1^2 / (p + 1) s_2^2$, but the simpler formula quoted in the example is usually adequate.
Both formulae are approximate, and since Cochran's formula gives a
lower value, it leads to a conservative estimate of $t$.
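
The two approximations to the effective degrees of freedom are easily compared numerically. The sketch below assumes that both take the forms described above, with Welch's version obtained from Cochran's by replacing each f by f + 2 and subtracting 2 from the result; the function names are hypothetical and the inputs are the mean squares and degrees of freedom of the numerical example.

```python
def cochran_df(rho, f1, f2):
    """Simpler approximation quoted in the numerical example."""
    return (rho + 1) ** 2 / (rho ** 2 / f1 + 1 / f2)

def welch_df(rho, f1, f2):
    """Welch-type approximation (form assumed here): f + 2 in place of f, then minus 2."""
    return (rho + 1) ** 2 / (rho ** 2 / (f1 + 2) + 1 / (f2 + 2)) - 2

# rho = v * s1^2 / ((p + 1) * s2^2) with v = 5, p = 1, s1^2 = 128, s2^2 = 14069.
rho = 5 * 128.0 / (2 * 14069.0)
f_cochran = cochran_df(rho, 29, 4)
f_welch = welch_df(rho, 29, 4)
print(f"Cochran: {f_cochran:.2f}  Welch: {f_welch:.2f}")
```

For these data the two values differ by less than a tenth of a degree of freedom, with Cochran's the smaller.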

If g is not small enough to be negligible (i.e. is not less than about 
0.1), the estimation of g and f may be achieved by an iterative process, 
but this will rarely be necessary, since multiple assays usually occur 
as microbiological assays, in which case g is generally very small. 
When g is taken to be zero, the formulae simplify to those given in the 
example. The use of doses spaced at equal intervals on a logarithmic 
scale also simplifies the calculations. 

I am grateful to Mr. C. P. Cox for helpful suggestions. 




REFERENCES 


Bennett, B. M. (1954). Some further extensions of Fieller’s Theorem. Annals of the 
Institute of Statistical Mathematics, 5, 103. 

Bliss, C. I. (1952). The statistics of bioassay. New York: Academic Press, Inc. 

Cochran, W. G. (1951). Testing a linear relation among variances. Biometrics, 7, 17. 

Fieller, E. C. (1940). The biological standardization of insulin. J. Roy. Stat. Soc. 
Supp., 7, 1. 

Finney, D. J. (1952). Statistical method in biological assay. London: Griffin. 

Welch, B. L. (1947). The generalization of ‘“Student’s’’ problem when several differ- 
ent population variances are involved. Biometrika, 34, 28. 


MISSING AND “MIXED-UP” FREQUENCIES IN 
CONTINGENCY TABLES 


G. S. Watson
Australian National University, Canberra, A. C. T. 


1. Introduction 


Missing and mixed-up values in experiments designed for analysis 
of variance cause but little trouble because the method of dealing with 
this type of data is well-known (see e.g. Cochran and Cox, 1950) and 
easily applied. The same problem may appear, though less often, in 
frequency data to be analysed by chi-square. For example, suppose 
that a botanist selects a random sample of a certain type of eucalypt 
in each of three rainfall belts. Each selected tree is classified as high, 
medium or low. The botanist intends to make a chi-square test, in 
the 3 X 3 table so obtained, that there is no association between height 
and rainfall. But when he comes to do the analysis he finds that, in 
the high rainfall belt, only the frequency of high trees is clearly desig- 
nated—the other two frequencies cannot be identified. How should 
he make his test? Alternatively, suppose that in his data the number 
of high trees in the high rainfall sample, and the total number of trees 
in that sample are missing. How should he make his test? 

The tests required are given below. The procedures for more 
complicated cases are also suggested. 


2. Missing frequencies 


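Suppose first, that the cell frequencies $f_{ij}$ in an $r \times c$ contingency
table are incomplete because $f_{11}$ is missing. Denote the existing row
and column totals by $R_i$ ($i = 1, \dots, r$) and $C_j$ ($j = 1, \dots, c$) respec-
tively and the total of the recorded frequencies by $N$. On the null
hypothesis of no association, the observed cell frequencies are a sample
of $N$ from a multinomial population with probabilities

$$\frac{p_i q_j}{1 - p_1 q_1} \qquad (i = 1, \dots, r;\ j = 1, \dots, c;\ (i, j) \neq (1, 1)),$$

where $\sum p_i = 1$, $\sum q_j = 1$. The maximum likelihood estimates of the
$p_i$ and the $q_j$ are easily seen to satisfy the equations

$$\frac{R_1}{\hat p_1} + \frac{N \hat q_1}{1 - \hat p_1 \hat q_1} + \lambda = 0, \qquad \frac{R_i}{\hat p_i} + \lambda = 0 \quad (i = 2, \dots, r),$$

and

$$\frac{C_1}{\hat q_1} + \frac{N \hat p_1}{1 - \hat p_1 \hat q_1} + \mu = 0, \qquad \frac{C_j}{\hat q_j} + \mu = 0 \quad (j = 2, \dots, c).$$

Writing $x$ for $N \hat p_1 \hat q_1 / (1 - \hat p_1 \hat q_1)$, it is clear that $\lambda = \mu = -(N + x)$
so that

$$\hat p_1 = \frac{R_1 + x}{N + x}, \qquad \hat p_i = \frac{R_i}{N + x} \quad (i = 2, \dots, r),$$

$$\hat q_1 = \frac{C_1 + x}{N + x}, \qquad \hat q_j = \frac{C_j}{N + x} \quad (j = 2, \dots, c).$$

These equations are of the same form as the equations when no data
are missing, if $x$ is interpreted as the missing value. Introducing the
expressions for $\hat p_1$ and $\hat q_1$ into the formula for $x$ and solving the resulting
quadratic, we find

$$x = \frac{R_1 C_1}{N - R_1 - C_1} \qquad (1)$$

since the other solution $x = -N$ may be ignored. If then $x$ is calculated
from this formula and added to the first row, first column and grand
totals, the chi-square computed by the ordinary method receives no
contribution from the cell (1,1) since

$$x = (R_1 + x)(C_1 + x) / (N + x).$$

Indeed this is an intuitive way of computing $x$.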



MISSING VALUES 49 


The resulting chi-square may be written algebraically as

$$\sum_{(i,j) \neq (1,1)} \frac{f_{ij}^2}{(N + x)\hat p_i \hat q_j} - N$$

with $(r - 1)(c - 1) - 1$ degrees of freedom. The general theory of
chi-square tests of fit for the multinomial distribution (see e.g. Cramer
1946) gives us the chi-square

$$\sum_{(i,j) \neq (1,1)} \frac{f_{ij}^2}{N \hat p_i \hat q_j / (1 - \hat p_1 \hat q_1)} - N$$

with $(r - 1)(c - 1) - 1$ degrees of freedom. Since

$$(N + x)(1 - \hat p_1 \hat q_1) = N$$

these two expressions are identical so that the intuitive method is
correct. This is not completely analogous to the situation in the
analysis of variance; there the treatment sum of squares, calculated
from the data including the missing value estimate, is biased although
the error sum of squares is not.
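
The procedure for a single missing frequency can be sketched in code. The example below is illustrative only: the 3 × 3 table is invented, the missing cell is marked with `None`, and the function names are hypothetical. It estimates the missing value as (row total × column total)/(N − row total − column total), using the totals of the recorded frequencies, fills it in and computes chi-square by the ordinary method.

```python
def fill_missing_cell(table):
    """Return (filled table, estimate) for a table with exactly one None entry."""
    r, c = len(table), len(table[0])
    (i0, j0), = [(i, j) for i in range(r) for j in range(c) if table[i][j] is None]
    R = sum(val for val in table[i0] if val is not None)                  # recorded row total
    C = sum(table[i][j0] for i in range(r) if table[i][j0] is not None)   # recorded column total
    N = sum(val for row in table for val in row if val is not None)       # recorded grand total
    x = R * C / (N - R - C)
    filled = [list(row) for row in table]
    filled[i0][j0] = x
    return filled, x

def ordinary_chi_square(filled):
    """Usual contingency-table chi-square, sum of (obs - exp)^2 / exp."""
    rows = [sum(row) for row in filled]
    cols = [sum(col) for col in zip(*filled)]
    total = sum(rows)
    return sum((filled[i][j] - rows[i] * cols[j] / total) ** 2
               / (rows[i] * cols[j] / total)
               for i in range(len(rows)) for j in range(len(cols)))

# Invented frequencies; the (1,1) cell is missing.
table = [[None, 10, 15],
         [12, 20, 18],
         [8, 25, 30]]
filled, x = fill_missing_cell(table)
chi2 = ordinary_chi_square(filled)
df = (len(table) - 1) * (len(table[0]) - 1) - 1   # one d.f. lost for the missing cell
print(round(x, 3), round(chi2, 3), df)
```

The filled cell equals its own expected value, so it adds nothing to chi-square, which is the property noted in the text.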

As with the analysis of variance, the correct analysis, when there 
are several missing frequencies, varies with their disposition. The 
above analysis is easily extended and shows that the correct procedure 
can be based on the missing value formula (1) and the ordinary method 
of computing chi-square. The formula (1) should be used iteratively 
to give estimates of all the missing values. The degrees of freedom of 
chi-square will be (r — 1) (c — 1) less the number of missing frequencies. 
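
For several missing frequencies, the iterative use of formula (1) might look like the following sketch. The details are assumptions: each missing cell is re-estimated in turn while the other missing cells are held at their current estimates, starting from zero, until the estimates stop changing. The table and the function name are invented for illustration.

```python
def fill_missing_cells(table, tol=1e-10, max_iter=1000):
    """Iterate the single-cell formula over all None entries of the table."""
    r, c = len(table), len(table[0])
    missing = [(i, j) for i in range(r) for j in range(c) if table[i][j] is None]
    filled = [[0.0 if val is None else float(val) for val in row] for row in table]
    for _ in range(max_iter):
        change = 0.0
        for i, j in missing:
            # Totals excluding the cell being estimated, other estimates included.
            R = sum(filled[i]) - filled[i][j]
            C = sum(row[j] for row in filled) - filled[i][j]
            N = sum(map(sum, filled)) - filled[i][j]
            x = R * C / (N - R - C)
            change = max(change, abs(x - filled[i][j]))
            filled[i][j] = x
        if change < tol:
            break
    return filled, missing

# Invented table with two missing frequencies.
table = [[None, 10, 15],
         [12, None, 18],
         [8, 25, 30]]
filled, missing = fill_missing_cells(table)
df = (len(table) - 1) * (len(table[0]) - 1) - len(missing)
print([round(filled[i][j], 3) for i, j in missing], df)
```

At convergence every estimated cell equals (row total × column total)/(grand total) of the completed table, so none of them contributes to the ordinary chi-square.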


3. “Mixed-up”’ frequencies. 


Suppose, for example, that in our $r \times c$ contingency table, the
identity of $f_{11}$ and $f_{12}$ is lost completely. Then the observed frequencies
are, on the null hypothesis, a sample of $N$ from a multinomial with
probabilities

$$\tfrac{1}{2} p_1 (q_1 + q_2) \text{ for each of the cells } (1, 1) \text{ and } (1, 2), \qquad p_i q_j \text{ for all other cells}.$$

If the loss is partial, we might be able to associate probabilities
$\pi p_1 q_1 + (1 - \pi) p_1 q_2$ and $(1 - \pi) p_1 q_1 + \pi p_1 q_2$ with cells (1, 1) and
(1, 2) where $\pi$ is the strength of our belief that $f_{11}$ belongs to cell (1, 1).
Since in most practical cases our estimate of $\pi$ would be vague, we take
it to be $\tfrac{1}{2}$; this gives the multinomial above.


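Applying maximum likelihood as before, we find that

$$\hat p_i = \frac{R_i}{N} \qquad (i = 1, \dots, r),$$

$$\hat q_1 = \frac{(C_1 + C_2)(C_1 - f_{11})}{N (C_1 + C_2 - f_{11} - f_{12})}, \qquad \hat q_2 = \frac{(C_1 + C_2)(C_2 - f_{12})}{N (C_1 + C_2 - f_{11} - f_{12})},$$

$$\hat q_j = \frac{C_j}{N} \qquad (j = 3, \dots, c).$$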


The estimates are those which would be found intuitively. Chi-square
must be calculated as for a multinomial with $rc - 1$ cells with proba-
bilities as given above and it will have $(r - 1)(c - 1) - 1$ degrees of
freedom.

The same method may be used when another two, or more than two, 
frequencies are “‘mixed-up”’. 
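
A sketch of the "mixed-up" calculation follows. It rests on assumptions that should be read as such: the two frequencies whose identity is lost sit in the first two cells of the first row, they are pooled into a single cell with probability p₁(q₁ + q₂), the column totals include them as recorded, and the estimates are those obtained by maximizing the pooled multinomial likelihood. The table and the function name are invented.

```python
def mixed_up_chi_square(table):
    """Chi-square when table[0][0] and table[0][1] are known only as a pair."""
    r, c = len(table), len(table[0])
    N = sum(map(sum, table))
    p = [sum(row) / N for row in table]
    col = [sum(row[j] for row in table) for j in range(c)]
    # Parts of the first two columns that are not affected by the mix-up.
    rest1 = col[0] - table[0][0]
    rest2 = col[1] - table[0][1]
    pooled = (col[0] + col[1]) / N            # estimate of q1 + q2
    q = [pooled * rest1 / (rest1 + rest2),
         pooled * rest2 / (rest1 + rest2)] + [cj / N for cj in col[2:]]
    # Pooled cell first, then every other cell in the ordinary way.
    observed = table[0][0] + table[0][1]
    expected = N * p[0] * (q[0] + q[1])
    chi2 = (observed - expected) ** 2 / expected
    for i in range(r):
        for j in range(c):
            if i == 0 and j < 2:
                continue
            expected = N * p[i] * q[j]
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, (r - 1) * (c - 1) - 1, q

# Invented frequencies; 14 and 9 are known only as a pair.
table = [[14, 9, 15],
         [12, 20, 18],
         [8, 25, 30]]
swapped = [[9, 14, 15]] + table[1:]
chi2, df, q = mixed_up_chi_square(table)
chi2_swapped, _, _ = mixed_up_chi_square(swapped)
print(round(chi2, 3), df)
```

The result does not depend on which of the two mixed-up values is written in which cell, which is what one would want of a procedure for frequencies whose identity is lost.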


REFERENCES 


Cochran, W. G. & G. M. Cox (1950). Experimental Designs, §3.7, 72-74, John Wiley 
& Sons, New York. 

Cramer, H. (1946). Mathematical Methods of Statistics, §30.3, 424-434. Princeton 
University Press, Princeton. 




CONTRIBUTIONS TO SIMULTANEOUS CONFIDENCE 
INTERVAL ESTIMATION* 


K. V. RAMACHANDRAN** 


Institute of Statistics 
University of North Carolina 


1. Summary. In this paper simultaneous confidence bounds for
parameters in two different situations are given and examples are
worked out to illustrate their uses. The present paper is primarily
concerned with the calculation of simultaneous confidence intervals
for specified confidence coefficient $1 - \alpha$. We are controlling the
risk of error "experimentwise" in the sense of Tukey. Practical
application of rules of estimation involving variable $\alpha$ would require
additional tables.


2. Statement of the Problems. 


(a) Let $x_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n + 1)$ be samples of sizes
$(n + 1)$ from $k$ independent normal populations with means $\mu_i$ and
variances $\sigma_i^2$ $(i = 1, 2, \ldots, k)$. When $\sigma_i^2 = \sigma^2$ $(i = 1, 2, \ldots, k)$,
simultaneous confidence bounds connected with the means $\mu_i$ are given in
[8, 9, 10]. In this paper we consider the problem of simultaneous
confidence interval estimation on all ratios of the variances
$\gamma_{ii'} = \sigma_i^2/\sigma_{i'}^2$ $(i \neq i';\ i, i' = 1, 2, \ldots, k)$. The problem of
simultaneous confidence interval estimation for all contrasts of the log
variances will not be discussed here.

*Work sponsored by the Office of Naval Research under Contract NR 042 031 at Chapel Hill.
**Present address: Department of Statistics, University of Baroda, India.

(b) In factorial experiments we are usually interested in estimating
linear functions of treatment effects, whose estimates are independently
distributed with a common variance. Suppose, for example, that we
have observations from a $t^p$ factorial experiment with factors
$A_1, A_2, \ldots, A_p$ at $t$ levels each and suppose that we are interested in
simultaneously estimating the main effects only. We shall suppose
that the experiment is so laid out that none of these is confounded in
any replication. Let $\theta_i$ $(i = 1, 2, \ldots, p)$ denote the true main effects
and let $s^2$ be an unbiased and independent estimate of the common
(unknown) population variance $\sigma^2$ based on $q$ degrees of freedom
(d.f.) (say, the error mean square in the analysis of variance). Let
$v_i$ $(i = 1, 2, \ldots, p)$ be the mean squares corresponding to the main
effects of $A_i$ $(i = 1, 2, \ldots, p)$. It is known that when the true main
effects are zero, the $v_i$ are distributed as independent central chi-square
variables with $t - 1$ d.f. each. In factorial experiments it is well known
that the $t - 1$ d.f. belonging to the main effects of $A_i$ $(i = 1, 2, \ldots, p)$
can be split up into $t - 1$ orthogonal components of 1 d.f. each. The
most useful splitting up of the sum of squares due to the main effect
of $A_i$ $(i = 1, 2, \ldots, p)$ is into 1 d.f. each for linear, quadratic, cubic, ...,
effects. Also $\theta_i$ $(i = 1, 2, \ldots, p)$ can be split up in a similar way into
$t - 1$ components corresponding to the $t - 1$ groups of the sums of
squares of the main effects of $A_i$ $(i = 1, 2, \ldots, p)$. Let
be the respective values such that ui, = 6,(¢ = 1,2, --- , p). 
Also if 3, , Zi2, *** , Zice-1) denote the ¢ — 1 components of the sum of 
squares due to the main effects of A; , then 27; = 1,2, ---, p). 
We shall consider the problem of simultaneous confidence bounds on all
linear functions (of unit length) of the $\mu_{ij}$'s.


3. Solution. 

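(a) Under the set-up given in 2(a), it is known that $n s_i^2/\sigma_i^2$, with

$$n s_i^2 = \sum_{j=1}^{n+1} (x_{ij} - \bar{x}_i)^2,$$

where

$$\bar{x}_i = \frac{1}{n+1} \sum_{j=1}^{n+1} x_{ij} \qquad (i = 1, 2, \ldots, k),$$

has a chi-square distribution with $n$ d.f. Also it is well known that
$(s_i^2/\sigma_i^2)/(s_{i'}^2/\sigma_{i'}^2)$ $(i \neq i';\ i, i' = 1, 2, \ldots, k)$ has an $F$ distribution with
$(n, n)$ d.f.
Also

$$\frac{s_i^2/\sigma_i^2}{s_{i'}^2/\sigma_{i'}^2} \le F' \tag{1}$$

implies

$$\frac{s_i^2}{s_{i'}^2} \le F' \, \frac{\sigma_i^2}{\sigma_{i'}^2}, \tag{2}$$

that is,

$$\frac{\sigma_{i'}^2}{\sigma_i^2} \le F' \, \frac{s_{i'}^2}{s_i^2}, \tag{3}$$

or

$$\frac{1}{F'} \, \frac{s_i^2}{s_{i'}^2} \le \frac{\sigma_i^2}{\sigma_{i'}^2}. \tag{4}$$

Let $W$ be the intersection of the regions (1) for $i \neq i'$; $i, i' = 1, 2, \ldots, k$.
Then clearly the necessary and sufficient condition for the sample point
to lie in $W$ is that

$$F \le F', \tag{5}$$

where

$$F = \sup_{i \neq i'} \frac{s_i^2/\sigma_i^2}{s_{i'}^2/\sigma_{i'}^2}. \tag{6}$$

Thus, if we set $F' = F'_\alpha$, where $F'_\alpha$ is the upper $\alpha$ point of the
distribution of the $F_{\max}$ ratio with $(n, n)$ d.f. and based on $k$ variances [2], then,
with probability $1 - \alpha$,

$$\frac{1}{F'_\alpha} \, \frac{s_i^2}{s_{i'}^2} \le \frac{\sigma_i^2}{\sigma_{i'}^2} \le F'_\alpha \, \frac{s_i^2}{s_{i'}^2} \qquad \text{for all } i \neq i'. \tag{7}$$

The associated test of the hypothesis that all the $k$ variances are equal
is obtained by using as the region of acceptance

$$\sup_{i \neq i'} \frac{s_i^2}{s_{i'}^2} = \frac{s_{\max}^2}{s_{\min}^2} \le F'_\alpha. \tag{8}$$

We have proved [6] that the associated test [3] is unbiased.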

(b) Under the set-up given in 2(b), it is well known that
$F_i = \sum_{j=1}^{t-1} (x_{ij} - \mu_{ij})^2 / [(t - 1) s_i^2]$ has an ordinary $F$ distribution with
$(t - 1, q)$ d.f. $(i = 1, 2, \ldots, p)$, where $s_i^2 = \text{Est. var}\,(x_{ij})$. Hence using methods
similar to those given in [8, 9], it is easy to check that a set of
simultaneous confidence bounds on all linear functions (of unit length) of
$\mu_{ij}$ (for all $i = 1, 2, \ldots, p$) is given by


(9) 




where $u_\alpha$ is the upper $\alpha$ point of the Studentized largest chi-square
[7]. Upper 5 per cent points of the Studentized largest chi-square are
given in Table 1 for certain values of the parameters. In this situation
we have proved [7] that the associated test has the monotonicity
property.


4. Examples to illustrate the use of 2(a) and 2(b).


TABLE 1

Upper 5% points of u when t = 3 and for different values of p and q

   q \ p      1       2       3       4       5       6       7       8
     5      5.79    7.88    9.24   10.26   11.08   11.76   12.35   12.87
     6      5.14    6.90    8.03    8.88    9.56   10.12   10.61   11.04
     7      4.74    6.28    7.27    8.01    8.60    9.09    9.51    9.88
     8      4.46    5.86    6.75    7.42    7.95    8.39    8.77    9.11
    10      4.10    5.32    6.09    6.66    7.12    7.50    7.83    8.11
    12      3.89    4.99    5.69    6.21    6.62    6.96    7.25    7.51
    16      3.63    4.62    5.23    5.68    6.04    6.33    6.59    6.81
    20      3.49    4.41    4.98    5.39    5.71    5.98    6.22    6.42
    24      3.40    4.29    4.82    5.20    5.51    5.76    5.98    [illegible]
    ∞       3.00    3.69    4.08    4.36    4.58    4.76    4.92    5.05


(a) Table 2* shows measurements of tensile strength $x_{ij}$
$(i = 1, 2, \ldots, 5;\ j = 1, 2, \ldots, 6)$ made on 6 specimens of rubber randomly
selected from each of 5 different batches.
We assume that the $x_{ij}$'s are from normal populations with means
$\mu_i$ and variance $\sigma_i^2$.


*Data taken from [4] page 3s.


TABLE 2

Measurements of tensile strength (in kg/cm²) of specimens of rubber.

   Specimen                     Batch number
   number             i=1     i=2     i=3     i=4     i=5
   j=1                177     116     170     181     177
   j=2                172     179     156     190     186
   j=3                137     182     188     210     199
   j=4                196     143     212     173     202
   j=5                145     156     164     172     204
   j=6                168     174     184     187     198
   Mean x̄_i          165.8   158.3   179.0   185.5   194.3
   Mean square s_i²   468.6   653.1   406.0   196.3   111.5


Simultaneous confidence bounds for all ratios of the variances,
$\gamma_{ii'} = \sigma_i^2/\sigma_{i'}^2$ $(i \neq i';\ i, i' = 1, 2, \ldots, 5)$, will be obtained by using (7).
Now the upper 5 per cent value of $F_{\max}$ with $k = 5$, $n = 5$ is 16.3.
Hence substituting in (7) we get, with probability .95,


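$$
\begin{aligned}
.0440 &\le \gamma_{12} \le 11.6953 \\
.0708 &\le \gamma_{13} \le 18.8135 \\
.1465 &\le \gamma_{14} \le 38.9114 \\
.2578 &\le \gamma_{15} \le 68.5040 \\
.0987 &\le \gamma_{23} \le 26.2202 \\
.2041 &\le \gamma_{24} \le 54.2317 \\
.3593 &\le \gamma_{25} \le 95.4756 \\
.1269 &\le \gamma_{34} \le 33.7133 \\
.2234 &\le \gamma_{35} \le 59.3532 \\
.1080 &\le \gamma_{45} \le 28.6962
\end{aligned}
\tag{10}
$$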

(b) Table 3* gives the plan and yields of beans in pounds of a $2^4$
factorial experiment conducted by the Rothamsted Experimental
Station in 1936.
TABLE 3

                      Rep I                                Rep II

   Block A   p     k     d     npk         Block A   npk   d     p     dnk
             45    55    53    36                    43    42    39    34
             dnk   dnp   dpk   n                     n     dnp   k     dpk
             41    48    55    42                    47    52    50    44
             Block total: 375                        Block total: 351

   Block B   dp    nk    dk    pk          Block B   nk    dp    (1)   np
             50    44    43    51                    43    52    57    39
             dnpk  (1)   dn    np                    pk    dk    dnpk  dn
             44    58    41    50                    56    52    54    42
             Block total: 381                        Block total: 395

   Dung (D): 10 tons per acre
   Nitro chalk (N): 0.4 cwt N per acre
   Superphosphate (P): 0.6 cwt P2O5 per acre
   Muriate of potash (K): 1.0 cwt K2O per acre


*Data taken from [1] page 160.




The four main effects D, N, P, K are given below:

$$x_{11} = -1.00, \quad x_{21} = -12.75, \quad x_{31} = 1.75, \quad x_{41} = -1.50. \tag{11}$$

Suppose we are interested in simultaneously estimating the four main
effects only. The error sum of squares (in the analysis of variance)
with 14 d.f. is 340.0. Hence $s^2 = 340.0/14 = 24.29$ with $q = 14$ d.f.
Now Est. var $(x_{i1}) = 24.29/2 = 12.14$ $(i = 1, 2, 3, 4)$ and the upper 5
per cent point of $\sqrt{u}$ is 2.84 (obtained by interpolation in Table III [5]).
Hence using (9) we have with a probability = .95

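$$
\begin{aligned}
-10.89 &\le \mu_{11} \le 8.89 \\
-22.64 &\le \mu_{21} \le -2.86 \\
-8.14 &\le \mu_{31} \le 11.64 \\
-11.39 &\le \mu_{41} \le 8.39
\end{aligned}
\tag{12}
$$

Notice that $\mu_{11}, \mu_{21}, \mu_{31}, \mu_{41}$ are respectively the true main effects of
D, N, P, K.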
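The arithmetic of this example can be retraced from Table 3. The sketch below rests on two assumptions that are inferred from the published numbers rather than stated in the text: each main effect is taken as (sum of yields with the factor minus sum without it) divided by 8, and each interval is taken as the estimate plus or minus $\sqrt{u}$ times the estimated standard error. The names `rep1`, `rep2`, `main_effect` and `SQRT_U_05` are illustrative.

```python
from math import sqrt

# Plot yields from Table 3, keyed by treatment combination, one dict per replicate.
rep1 = {"p": 45, "k": 55, "d": 53, "npk": 36, "dnk": 41, "dnp": 48, "dpk": 55, "n": 42,
        "dp": 50, "nk": 44, "dk": 43, "pk": 51, "dnpk": 44, "(1)": 58, "dn": 41, "np": 50}
rep2 = {"npk": 43, "d": 42, "p": 39, "dnk": 34, "n": 47, "dnp": 52, "k": 50, "dpk": 44,
        "nk": 43, "dp": 52, "(1)": 57, "np": 39, "pk": 56, "dk": 52, "dnpk": 54, "dn": 42}


def main_effect(letter, reps):
    """(Sum with the factor - sum without it) / 8; the divisor 8 is an assumption
    chosen because it reproduces the published effects."""
    diff = 0
    for rep in reps:
        for label, y in rep.items():
            diff += y if letter in label else -y
    return diff / 8


effects = {f: main_effect(f.lower(), (rep1, rep2)) for f in "DNPK"}

s2 = 24.29            # error mean square on q = 14 d.f. (340.0 / 14, as quoted in the text)
est_var = s2 / 2      # estimated variance of each effect, as in the text
SQRT_U_05 = 2.84      # upper 5 per cent point of sqrt(u), quoted from Table III of [5]

half_width = SQRT_U_05 * sqrt(est_var)
bounds = {f: (x - half_width, x + half_width) for f, x in effects.items()}

for f in "DNPK":
    print(f"{f}: effect {effects[f]:7.2f}, bounds ({bounds[f][0]:7.2f}, {bounds[f][1]:7.2f})")
```

Only the interval for N excludes zero, so nitro chalk is the one factor whose main effect is judged non-zero when all four are considered simultaneously.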

5. Acknowledgement. I wish to acknowledge my indebtedness to 
Professor S. N. Roy for his help and guidance in the preparation of 
this paper. 


REFERENCES 


[1] Cochran, W. G. and G. M. Cox. Experimental Designs, John Wiley and Sons,
Inc., New York. (1950).

[2] David, H. A. Upper 5 and 1 per cent points of the Maximum F-ratio, Biometrika,
39 (1952), 422-424.

[3] Hartley, H. O. The maximum F-ratio as a short cut test for heterogeneity of
variance. Biometrika, 37 (1950), 308-312.

[4] Pearson, E. S. and H. O. Hartley, Biometrika Tables for Statisticians, Vol. 1.
Cambridge University Press (1954).

[5] Pillai, K. C. S. and K. V. Ramachandran. Distribution of a Studentized order
statistic. Annals of Mathematical Statistics, 25 (1954), 565-572.

[6] Ramachandran, K. V. On the Tukey test for the equality of means and the
Hartley test on the equality of variances. (Submitted for publication in the
Annals of Mathematical Statistics).

[7] Ramachandran, K. V. On the simultaneous analysis of variance test.
(Submitted for publication in the Annals of Mathematical Statistics).

[8] Roy, S. N. and R. C. Bose. Simultaneous confidence interval estimation.
Annals of Mathematical Statistics, 24 (1953), 513-536.

[9] Scheffé, H. A method for judging all contrasts in the analysis of variance.
Biometrika, 40 (1953), 87-104.

[10] Tukey, J. W. Allowances for various types of error rates. (Unpublished invited
address, Blacksburg meeting of the Institute of Mathematical Statistics,
March, 1952).




RANDOM GENETIC DRIFT IN A TRI-ALLELIC LOCUS; 
EXACT SOLUTION WITH A CONTINUOUS MODEL* 


MOTOO KIMURA


Department of Genetics, University of Wisconsin 


1. Introduction 


Random genetic drift is a stochastic process of change in gene 
frequency in finite populations due to random sampling of gametes in 
reproduction. Since the pioneering works by Fisher (1922, 1930) and 
Wright (1931), much attention has been paid to this phenomenon and 
many theoretical as well as experimental studies have been attempted. 
A brief review on this topic will be found in a previous paper (Kimura 
1955b). The evolutionary significance of random drift is still in dispute 
(Fisher and Ford 1947, 1950; Wright 1948, 1951), and decisive evidence 
for any conclusion is still missing. Haldane (1954) has suggested an 
analysis of frequencies of antigenic characters among neighboring 
populations for this purpose. Recently, Glass (1954) reviewed some 
evidence for the operation of random drift in human populations. From 
the genetical point of view, it is highly probable that there exists a class 
of genes so nearly neutral in selective value that random genetic drift 
plays a prominent role in determining the local differentiation of the 
gene frequencies. The best examples are to be found in certain isoalleles 
in Drosophila and other organisms. From the standpoint of math- 
ematical genetics, the problem of random drift provides an area where 
the theory of Markov processes finds important applications (Feller 
1951, Crow and Kimura 1955). 

In a recent paper the present author reported a complete solution
of the process for the case of a pair of alleles (Kimura 1955a). With
multiple alleles the problem becomes more difficult, and the solution
even for three alleles (Kimura 1955b) contains a function $C_i(x, y)$
and only the first three terms in the expansion are given explicitly. The
purpose of the present paper is to give the exact solution for a triallelic
locus. First I shall summarize briefly the previous asymptotic results
and then show how the exact solution may be obtained by the use of
partial differential equations.


*Paper No. 492 from the Department of Genetics, University of Wisconsin. Also Contribution
No. 113 of the National Institute of Genetics, Mishima-shi, Japan.


2. The asymptotic formulae obtained by the calculation of moments of the 
distribution 


Consider a randomly mating population of effective size $N$. Let
$x_t$, $y_t$ and $z_t$ $(= 1 - x_t - y_t)$ be the frequencies of alleles $A_1$, $A_2$ and $A_3$
respectively in the $t$th generation. We denote by $\mu'^{(t)}_{m,n}$ the $m, n$th moment
of the distribution about zero at the $t$th generation, such that
$\mu'^{(t)}_{m,n} = E(x_t^m y_t^n)$. In natural populations the number of individuals is usually
large and we consider the case where $N$ is sufficiently large that $1/N^2$
may be neglected. Due to the random sampling of gametes in
reproduction, the moments change gradually from generation to generation
and, with suitable assumptions, we can derive the following infinite
system of differential equations:

$$\frac{d\mu'^{(t)}_{m,n}}{dt} = -\frac{(m+n)(m+n-1)}{4N}\,\mu'^{(t)}_{m,n} + \frac{m(m-1)}{4N}\,\mu'^{(t)}_{m-1,n} + \frac{n(n-1)}{4N}\,\mu'^{(t)}_{m,n-1} \qquad (m, n = 1, 2, 3, \ldots). \tag{1}$$


If the initial frequencies of $A_1$, $A_2$ and $A_3$ in the population are $p$, $q$
and $r$ respectively $(p + q + r = 1)$, we have $\mu'^{(0)}_{m,n} = p^m q^n$ as the initial
condition of (1). It can be shown that (1) has the solution of the form:

$$\mu'^{(t)}_{m,n} = \sum_{i=1}^{m+n-1} C_i^{(m,n)} \exp\left\{-\frac{i(i+1)}{4N}\,t\right\}, \tag{2}$$

where the $C_i^{(m,n)}$'s are constants. From this we can obtain the various
probability distributions of the gene frequencies. The most important
one is the joint distribution of the frequencies of $A_1$ and $A_2$ (and,
therefore, of $A_3$ also) in the population which contains the three alleles.
We denote by $\phi(x, y \mid p, q; t)$ the density of the conditional probability
that the frequency of $A_1$ lies between $x$ and $x + dx$ and that of $A_2$ lies
between $y$ and $y + dy$ in the $t$th generation $(0 < x < x + y < 1)$,
given that they started from $x = p$ and $y = q$ at $t = 0$ $(0 < p < p + q < 1)$.
As was shown previously (Kimura 1955b) $\phi$ must have the
form:

$$\phi(x, y \mid p, q; t) = \sum_{i=1}^{\infty} C_i(x, y) \exp\left\{-\frac{(i+1)(i+2)}{4N}\,t\right\}. \tag{3}$$
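The moment system can be explored numerically. The sketch below assumes the system has the form $d\mu'_{m,n}/dt = -\frac{(m+n)(m+n-1)}{4N}\mu'_{m,n} + \frac{m(m-1)}{4N}\mu'_{m-1,n} + \frac{n(n-1)}{4N}\mu'_{m,n-1}$ with initial values $p^m q^n$, integrates the moments with $m + n \le 3$ by a hand-written Runge-Kutta step, and compares them with the closed forms that follow from that system. The function names and the choice of step size are illustrative.

```python
from math import exp


def moment_rhs(mu, N):
    """Right-hand side of the moment equations for the moments stored in mu[(m, n)]."""
    d = {}
    for (m, n), value in mu.items():
        rate = -(m + n) * (m + n - 1) / (4 * N) * value
        # Lower-order couplings; a missing key only occurs when its coefficient is zero.
        rate += m * (m - 1) / (4 * N) * mu.get((m - 1, n), 0.0)
        rate += n * (n - 1) / (4 * N) * mu.get((m, n - 1), 0.0)
        d[(m, n)] = rate
    return d


def rk4_step(mu, N, h):
    """One classical fourth-order Runge-Kutta step of length h (in generations)."""
    def shifted(base, slope, scale):
        return {key: base[key] + scale * slope[key] for key in base}

    k1 = moment_rhs(mu, N)
    k2 = moment_rhs(shifted(mu, k1, h / 2), N)
    k3 = moment_rhs(shifted(mu, k2, h / 2), N)
    k4 = moment_rhs(shifted(mu, k3, h), N)
    return {key: mu[key] + h / 6 * (k1[key] + 2 * k2[key] + 2 * k3[key] + k4[key])
            for key in mu}


def integrate_moments(p, q, N, generations, max_order=3):
    """Integrate all moments with m, n >= 1 and m + n <= max_order from t = 0."""
    mu = {(m, n): p ** m * q ** n
          for m in range(1, max_order) for n in range(1, max_order)
          if m + n <= max_order}
    for _ in range(generations):
        mu = rk4_step(mu, N, 1.0)
    return mu


p, q, N, t = 0.3, 0.5, 50, 100
mu = integrate_moments(p, q, N, t)

# Closed forms implied by the same equations: one exponential for (1, 1), two for (2, 1).
exact_11 = p * q * exp(-t / (2 * N))
exact_21 = (p * q / 2) * exp(-t / (2 * N)) + (p * p * q - p * q / 2) * exp(-3 * t / (2 * N))
print(mu[(1, 1)], exact_11)
print(mu[(2, 1)], exact_21)
```

The exponents that appear, $1/2N$ and $3/2N$, are the diagonal coefficients $(m+n)(m+n-1)/4N$ of the triangular system, which is why each moment is a finite sum of exponentials.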




The coefficients $C_i(x, y)$ are functions of $x$ and $y$ only and can be
obtained from the $C_i^{(m,n)}$'s. The calculations involved, however, are so
tedious that only the first few coefficients have been obtained:

$$C_1(x, y) = 5!\,pqr,$$
1 1 
3 2 3 3\ 2 
+(r + 3(p 1), 


1 1 


where $r = 1 - p - q$ and $z = 1 - x - y$. Since for large $t$ the
exponential terms decrease rapidly as $i$ increases, only the first few terms
in (3) are important and we obtain the asymptotic formula:

$$\phi(x, y \mid p, q; t) \sim C_1(x, y) \exp\left\{-\frac{3}{2N}\,t\right\} + C_2(x, y) \exp\left\{-\frac{3}{N}\,t\right\} \qquad (t \to \infty). \tag{4}$$

Thus the final rate of decay of the distribution surface is $3/2N$ per
generation. The probability $\Omega_t^{(3)}$ that all three alleles still co-exist
in the population at the $t$th generation is obtained by

$$\Omega_t^{(3)} = 60\,pqr \exp\left\{-\frac{3}{2N}\,t\right\} + 90\,pqr\{7(p^2 + q^2 + r^2) - 3\} \exp\left\{-\frac{5}{N}\,t\right\} + \cdots. \tag{5}$$

This result can also be obtained by other methods. Detailed derivation 
of these formulas and graphical illustrations of the distribution surfaces 
are given in a previous paper (Kimura 1955b). 


3. Solution by partial differential equations 


The method of partial differential equations has been used for the
study of random drift by Fisher (1922, 1930) and Wright (1945) and
has proved to be a very powerful tool (Kimura 1955a). This method
can be extended to the case of multiple alleles and multiple loci (Crow
and Kimura 1955). With three alleles at a single locus the equation
is written in the form:


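$$\frac{\partial \phi}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\{V_{\delta x}\phi\} + \frac{\partial^2}{\partial x\,\partial y}\{W_{\delta x \delta y}\phi\} + \frac{1}{2}\frac{\partial^2}{\partial y^2}\{V_{\delta y}\phi\} - \frac{\partial}{\partial x}\{M_{\delta x}\phi\} - \frac{\partial}{\partial y}\{M_{\delta y}\phi\}, \tag{6}$$

where $\delta x$ and $\delta y$ are the rate of change in $x$ and $y$ per generation. In
this equation $V$, $W$ and $M$ denote respectively the variance, covariance
and the mean of the quantities identified by subscripts. In the present
case

$$V_{\delta x} = \frac{x(1 - x)}{2N}, \qquad W_{\delta x \delta y} = -\frac{xy}{2N}, \qquad V_{\delta y} = \frac{y(1 - y)}{2N},$$

and

$$M_{\delta x} = M_{\delta y} = 0.$$

Therefore equation (6) becomes

$$\frac{\partial \phi}{\partial t} = \frac{1}{4N}\frac{\partial^2}{\partial x^2}\{x(1 - x)\phi\} - \frac{1}{2N}\frac{\partial^2}{\partial x\,\partial y}\{xy\,\phi\} + \frac{1}{4N}\frac{\partial^2}{\partial y^2}\{y(1 - y)\phi\}. \tag{7}$$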

We start from a population in which the frequencies of $A_1$ and $A_2$
are $p$ and $q$ respectively. Therefore the initial condition is

$$\phi(x, y \mid p, q; 0) = \delta(x - p) \cdot \delta(y - q), \tag{7'}$$

where $\delta(x)$ represents Dirac's delta function. The equation (7) has
singularities at the boundaries and no arbitrary conditions can be
imposed there.

To solve (7), we seek solutions of the following form as suggested in
(3):

$$\phi = Z e^{-\lambda t}, \tag{8}$$

where $Z$ is a function of $x$ and $y$ but not of $t$. $\lambda$ is given by

$$\lambda = \frac{(i + 1)(i + 2)}{4N},$$

where $i$ is a positive integer.




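By substituting (8) into (7) and transforming the independent
variables by

$$x = \rho(1 - \xi) \quad \text{and} \quad y = \rho\xi \qquad (0 < \rho, \xi < 1),$$

(7) becomes

$$\rho(1 - \rho)\frac{\partial^2 Z}{\partial \rho^2} + \frac{\xi(1 - \xi)}{\rho}\frac{\partial^2 Z}{\partial \xi^2} + 2(2 - 3\rho)\frac{\partial Z}{\partial \rho} + \frac{2(1 - 2\xi)}{\rho}\frac{\partial Z}{\partial \xi} + (i - 1)(i + 4)Z = 0. \tag{9}$$

The above transformation makes it possible to apply the standard
separation technique to solve the equation. Let

$$Z = R \cdot \Theta, \tag{10}$$

where $R$ is a function of $\rho$ only and $\Theta$ is a function of $\xi$ only. By this
substitution (9) can be changed into

$$\frac{1}{R}\left\{\rho^2(1 - \rho)\frac{d^2 R}{d\rho^2} + 2\rho(2 - 3\rho)\frac{dR}{d\rho}\right\} + (i - 1)(i + 4)\rho = -\frac{1}{\Theta}\left\{\xi(1 - \xi)\frac{d^2 \Theta}{d\xi^2} + 2(1 - 2\xi)\frac{d\Theta}{d\xi}\right\}. \tag{11}$$

By assumption, $R$ is a function of $\rho$ only and hence the left side of (11)
depends only on $\rho$, while $\Theta$ is a function of $\xi$ only and hence the right
side of (11) depends only on $\xi$. It follows then that both sides of (11)
must equal a constant which we shall designate by $K$. Thus (11) can
be separated into the two equations:

$$\rho^2(1 - \rho)\frac{d^2 R}{d\rho^2} + 2\rho(2 - 3\rho)\frac{dR}{d\rho} + [(i - 1)(i + 4)\rho - K]R = 0, \tag{12}$$

$$\xi(1 - \xi)\frac{d^2 \Theta}{d\xi^2} + 2(1 - 2\xi)\frac{d\Theta}{d\xi} + K\Theta = 0. \tag{13}$$

First we identify (13) as the hypergeometric equation:

$$\xi(1 - \xi)\Theta'' + [\gamma - (\alpha + \beta + 1)\xi]\Theta' - \alpha\beta\,\Theta = 0, \tag{14}$$

where

$$\gamma = 2, \qquad \alpha + \beta = 3 \qquad \text{and} \qquad \alpha\beta = -K,$$

in this case.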




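Therefore we take

$$\alpha = \frac{3 + \sqrt{9 + 4K}}{2} \qquad \text{and} \qquad \beta = \frac{3 - \sqrt{9 + 4K}}{2}.$$

Though we cannot impose an arbitrary condition at the boundaries, we
do want a solution which is finite at these singular points ($\xi = 0$ and 1).
Of the two independent solutions of (14), only one, i.e.
$F(\alpha, \beta, 2, \xi)$, is finite at $\xi = 0$ in this case. In order to find the condition
which makes $F(\alpha, \beta, 2, \xi)$ finite at the other singularity ($\xi = 1$), we
note the following relation:

$$F(\alpha, \beta, 2, \xi) = \frac{\Gamma(2)\Gamma(2 - \alpha - \beta)}{\Gamma(2 - \alpha)\Gamma(2 - \beta)}\,F(\alpha, \beta, \alpha + \beta - 1, 1 - \xi) + \frac{\Gamma(2)\Gamma(\alpha + \beta - 2)}{\Gamma(\alpha)\Gamma(\beta)}\,(1 - \xi)^{2 - \alpha - \beta}\,F(2 - \alpha, 2 - \beta, 3 - \alpha - \beta, 1 - \xi).$$

Noting that $\alpha + \beta = 3$, we see that in order that $\lim_{\xi \to 1} F(\alpha, \beta, 2, \xi)$
be finite, $2 - \alpha$ must be a negative integer and $\beta$ must be 0 or a negative
integer. Thus the only possible values of $K$ are expressed by

$$K = (m - 1)(m + 2),$$

where the $m$'s are positive integers $(m = 1, 2, 3, \ldots)$. Corresponding
to this eigenvalue $K$, we have $\alpha = m + 2$, $\beta = 1 - m$, and if we put
$\xi = (1 - \theta)/2$, we then have

$$F(\alpha, \beta, 2, \xi) = F(\alpha, \beta, -1 + \alpha + \beta, 1 - \xi) = F(m + 2, 1 - m, 2, 1 - \xi),$$

except that it may be multiplied by a constant.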
It is convenient to express $\Theta$ in terms of the Gegenbauer polynomial
$T^1_{m-1}(\theta)$ which is defined by

$$T^1_{m-1}(\theta) = \frac{m(m + 1)}{2}\,F\!\left(m + 2, 1 - m, 2, \frac{1 - \theta}{2}\right). \tag{15}$$

It is known that the Gegenbauer polynomials $T^1_n(\theta)$ $(n = 0, 1, 2, \ldots)$
form a complete orthogonal system for the interval $-1 < \theta < 1$ with
the density function $(1 - \theta^2)$, and we have the following normalization
integral (see Morse and Feshbach 1953, p. 782-783):

$$\int_{-1}^{1} (1 - \theta^2)\,T^1_m(\theta)\,T^1_n(\theta)\,d\theta = \delta_{mn}\,\frac{2(n + 2)(n + 1)}{(2n + 3)}, \tag{16}$$

where $\delta_{mn}$ is the Kronecker delta function. It is also worth noting
that $T^1_n(\theta)$ is finite at $\xi = 1$ for finite $n$.

Next, we must reduce the equation (12) to a manageable form. For
this purpose we put

$$R = \rho^{m-1} U.$$

Then (12) reduces to the hypergeometric equation:

$$\rho(1 - \rho)\frac{d^2 U}{d\rho^2} + \{2(m + 1) - (2m + 4)\rho\}\frac{dU}{d\rho} - (m - i)(m + i + 3)U = 0.$$

This gives us the Jacobi polynomial as a pertinent solution:

$$U = J_{i-m}(2m + 3, 2m + 2, \rho) \qquad (i = m, m + 1, \ldots). \tag{18}$$

Here the Jacobi polynomial is defined by

$$J_n(a, c, \rho) = F(a + n, -n, c, \rho).$$

It is known that $\{J_n\}$, $n = 0, 1, 2, \ldots$, form a complete orthogonal
system for the interval $0 < \rho < 1$ with the density function
$\rho^{c-1}(1 - \rho)^{a-c}$ (Morse and Feshbach 1953, p. 780-781).
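The polynomial solutions used above are terminating hypergeometric series, so they are easy to evaluate and check numerically. The sketch below is written under stated assumptions: the Gegenbauer polynomial is taken as $T^1_n(\theta) = \frac{(n+1)(n+2)}{2} F(n + 3, -n, 2, (1 - \theta)/2)$ (the Morse and Feshbach normalization), the Jacobi polynomial as $J_n(a, c, \rho) = F(a + n, -n, c, \rho)$, and the Jacobi weight as $\rho^{c-1}(1 - \rho)^{a-c}$. The helper names and the Simpson integrator are illustrative, not part of the paper.

```python
from math import isclose


def hyp2f1_poly(a, b, c, x):
    """Terminating Gauss series F(a, b, c, x); b must be zero or a negative integer."""
    term, total = 1.0, 1.0
    for k in range(-b):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
    return total


def gegenbauer_T1(n, theta):
    # Assumed normalization: T^1_n(theta) = (n+1)(n+2)/2 * F(n+3, -n, 2, (1-theta)/2).
    return (n + 1) * (n + 2) / 2 * hyp2f1_poly(n + 3, -n, 2, (1 - theta) / 2)


def jacobi_J(n, a, c, rho):
    # J_n(a, c, rho) = F(a + n, -n, c, rho).
    return hyp2f1_poly(a + n, -n, c, rho)


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def gegenbauer_inner(m, n):
    """Integral of (1 - theta^2) T^1_m T^1_n over (-1, 1)."""
    return simpson(lambda th: (1 - th * th) * gegenbauer_T1(m, th) * gegenbauer_T1(n, th), -1.0, 1.0)


def jacobi_inner(j, k, a, c):
    """Integral of rho^(c-1) (1 - rho)^(a-c) J_j J_k over (0, 1)."""
    return simpson(lambda r: r ** (c - 1) * (1 - r) ** (a - c)
                   * jacobi_J(j, a, c, r) * jacobi_J(k, a, c, r), 0.0, 1.0)


for n in range(3):
    expected = 2 * (n + 2) * (n + 1) / (2 * n + 3)
    print(n, gegenbauer_inner(n, n), expected)
print(gegenbauer_inner(1, 2), jacobi_inner(0, 1, 5, 4))
```

Under these assumptions the lowest members are $T^1_0(\theta) = 1$, $T^1_1(\theta) = 3\theta$ and $J_1(5, 4, \rho) = 1 - \tfrac{3}{2}\rho$, and the printed integrals agree with the normalization $2(n + 2)(n + 1)/(2n + 3)$.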

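Combining all the above results, we express the solution in the form

$$\phi(x, y \mid p, q; t) = \sum_{m=1}^{\infty} \sum_{i=m}^{\infty} C(m - 1, i - m)\,\rho^{m-1} J_{i-m}(2m + 3, 2m + 2, \rho)\,T^1_{m-1}(\theta) \exp\left\{-\frac{(i + 1)(i + 2)}{4N}\,t\right\},$$

or by putting $m - 1 = n$ and $i - m = j$,

$$\phi(x, y \mid p, q; t) = \sum_{n=0}^{\infty} \sum_{j=0}^{\infty} C(n, j)\,\rho^{n} J_j(2n + 5, 2n + 4, \rho)\,T^1_n(\theta) \exp\left\{-\frac{(n + j + 2)(n + j + 3)}{4N}\,t\right\}, \tag{19}$$

where the $C(n, j)$'s are constants and

$$\rho = x + y, \qquad \theta = \frac{x - y}{x + y}. \tag{20}$$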


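Now we use the initial condition (7') to determine $C(n, j)$.
From (19), we have (putting $t = 0$)

$$\delta(x - p) \cdot \delta(y - q) = \sum_{n=0}^{\infty} \sum_{j=0}^{\infty} C(n, j)\,\rho^{n} J_j(2n + 5, 2n + 4, \rho)\,T^1_n(\theta). \tag{21}$$

We next multiply both sides of (21) by $\rho^{n'+3}(1 - \rho)\,J_k(2n' + 5, 2n' + 4, \rho)\,(1 - \theta^2)\,T^1_{n'}(\theta)$
and integrate over $0 < \rho < 1$, $-1 < \theta < 1$. If we
use the orthogonality relations

$$\int_0^1 \rho^{2n+3}(1 - \rho)\,J_j(2n + 5, 2n + 4, \rho)\,J_k(2n + 5, 2n + 4, \rho)\,d\rho = \delta_{jk}\,\frac{j!\,(j + 1)!\,(2n + 3)!\,(2n + 3)!}{(j + 2n + 3)!\,(j + 2n + 4)!\,(2j + 2n + 5)} \tag{22}$$

and (16), the right side becomes

$$C(n', k)\,\frac{2\,k!\,(k + 1)!\,(n' + 1)(n' + 2) \cdot (2n' + 2)!\,(2n' + 3)!}{(k + 2n' + 3)!\,(k + 2n' + 4)!\,(2k + 2n' + 5)}.$$

If we notice that the Jacobian of the transformation (20) is

$$\frac{\partial(\theta, \rho)}{\partial(x, y)} = \frac{2}{x + y},$$

the left side becomes

$$8pq\,(p + q)^{n'}(1 - p - q)\,T^1_{n'}\!\left(\frac{p - q}{p + q}\right) J_k(2n' + 5, 2n' + 4, p + q).$$

Therefore

$$C(n, j) = \frac{4\,(j + 2n + 3)!\,(j + 2n + 4)!\,(2j + 2n + 5)}{j!\,(j + 1)!\,(n + 1)(n + 2) \cdot (2n + 2)!\,(2n + 3)!}\;pqr\,(1 - r)^{n}\,T^1_n\!\left(\frac{p - q}{1 - r}\right) J_j(2n + 5, 2n + 4, 1 - r), \tag{23}$$

where $r = 1 - p - q$.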
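We then write the final result in the form:

$$
\begin{aligned}
\phi(x, y \mid p, q; t) = \sum_{n=0}^{\infty} \sum_{j=0}^{\infty}\;& \frac{4\,(j + 2n + 3)!\,(j + 2n + 4)!\,(2j + 2n + 5)}{j!\,(j + 1)!\,(n + 1)(n + 2)\,(2n + 2)!\,(2n + 3)!}\;pqr\,(1 - r)^{n}\,T^1_n\!\left(\frac{p - q}{1 - r}\right) J_j(2n + 5, 2n + 4, 1 - r) \\
&\times (1 - z)^{n}\,T^1_n\!\left(\frac{x - y}{1 - z}\right) J_j(2n + 5, 2n + 4, 1 - z) \\
&\times \exp\left\{-\frac{(n + j + 2)(n + j + 3)}{4N}\,t\right\},
\end{aligned}
\tag{24}
$$

where $z = 1 - x - y$. The functions $T^1_n(\cdot)$ and $J_j(\cdot, \cdot, \cdot)$ are
respectively the Gegenbauer and Jacobi polynomials as defined above.

It is not hard to verify that (24) generates the asymptotic formula
given in (4), if we notice that

$$T^1_0(\theta) = 1, \quad T^1_1(\theta) = 3\theta, \ \ldots$$

$$J_0(5, 4, \rho) = 1, \quad J_1(5, 4, \rho) = 1 - \tfrac{3}{2}\rho, \ \ldots$$

Also it should not be difficult to prove the uniform convergence of the
series (24) for $t > 0$, since the exponential terms decrease very rapidly
for large $n$ and $j$.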


4. Extension to more than three alleles 


The important point in the above treatment is that the method
can be extended to a larger number of alleles.

Thus with four alleles, say $A_1$, $A_2$, $A_3$ and $A_4$ whose frequencies
are $x$, $y$, $z$ and $u$ respectively $(x + y + z + u = 1)$, we have the following
partial differential equation;


The transformations

$$x = \rho\xi, \qquad y = \rho(1 - \xi)\eta, \qquad z = \rho(1 - \xi)(1 - \eta)$$

reduce the above equation to:

$$\rho(1 - \rho)\Phi_{\rho\rho} + \frac{\xi(1 - \xi)}{\rho}\Phi_{\xi\xi} + \frac{\eta(1 - \eta)}{\rho(1 - \xi)}\Phi_{\eta\eta} + 2(3 - 4\rho)\Phi_{\rho} + \frac{2(1 - 3\xi)}{\rho}\Phi_{\xi} + \frac{2(1 - 2\eta)}{\rho(1 - \xi)}\Phi_{\eta} + (i - 1)(i + 6)\Phi = 0.$$

Carrying out a similar but somewhat more complicated procedure we
can separate this into three equations for each of the three variables
$\rho$, $\xi$ and $\eta$. The final solution is then expressed as a linear combination of
the products of the solutions of these component equations and an
exponential term for $t$. The details will not be given here.

The above argument will be enough to suggest the techniques by
which the general case of an arbitrary number of alleles can be solved.
However, additional techniques will be needed to make the
mathematical manipulations manageable.


5. Summary


The exact solution for the process of random genetic drift in a
triallelic locus has been obtained by solving the partial differential
(Kolmogorov) equation (7) based on a continuous model.


The probability distribution of gene frequencies in the unfixed 
classes where all the three alleles coexist (24) indicates that the distri- 
bution surface finally becomes flat and decreases in height at the rate 
of 3/(2N) per generation as opposed to 1/(2N) for a pair of alleles. This 
confirms the asymptotic solution (4) previously obtained by another 
method. 

The applicability of the present method to cases with more than
three alleles has been discussed. The biological implications of the
problem have been considered in detail elsewhere (Kimura, 1955b).


ACKNOWLEDGMENT 


The author wishes to express his thanks to Dr. J. F. Crow for help 
and criticism. 


REFERENCES 


Crow, J. F. and M. Kimura. 1955. Some genetic problems in natural populations.
Proceedings of the Third Berkeley Symposium on Mathematical Statistics and
Probability. (In press).

Feller, W. 1951. Diffusion processes in genetics. Proceedings of the Second Berkeley
Symposium on Mathematical Statistics and Probability. Univ. of California Press.
pp. 227-246.

Fisher, R. A. 1922. On the dominance ratio. Proc. Roy. Soc. Edinb. 42: 321-341.

Fisher, R. A. 1930. The distribution of gene ratios for rare mutations. Proc. Roy. Soc.
Edinb. 50: 204-219.

Fisher, R. A. and E. B. Ford. 1947. The spread of a gene in natural conditions in a
colony of the moth Panaxia dominula L. Heredity, 1: 143-174.

Fisher, R. A. and E. B. Ford. 1950. The "Sewall Wright Effect." Heredity, 4: 117-119.

Glass, B. 1954. Genetic changes in human populations, especially those due to gene
flow and genetic drift. Advances in Genetics, 6: 95-139.

Haldane, J. B. S. 1954. The statistics of evolution. Evolution as a Process. pp.
109-121. London.

Kimura, M. 1955a. Solution of a process of random genetic drift with a continuous
model. Proc. Nat. Acad. Sci., 41: 144-150.

Kimura, M. 1955b. Random genetic drift in multi-allelic locus. Evolution, 9:
419-435.

Morse, P. M. and H. Feshbach. 1953. Methods of Theoretical Physics. New York.

Wright, S. 1931. Evolution in Mendelian populations. Genetics, 16: 97-159.

Wright, S. 1945. The differential equation of the distribution of gene frequencies.
Proc. Nat. Acad. Sci., 31: 382-389.

Wright, S. 1948. On the roles of directed and random changes in gene frequency in
the genetics of populations. Evolution, 2: 279-294.

Wright, S. 1951. Fisher and Ford on "The Sewall Wright Effect." American
Scientist, 39: 452-479.




MULTIVARIATE ANALYSIS AND AGRICULTURAL 
EXPERIMENTS 


D. J. FINNEY 


Agricultural Research Council Unit of Statistics, 
University of Aberdeen 


The aim of statistical science must always be to aid the research 
worker in making the best possible use of his efforts and his results; 
one important function for Biometrics is to provide a forum for the 
exchange of opinions on how this aim can be achieved in the biological 
sciences. Amongst the many papers on statistical science published 
today, some appear to find new outlets for mathematical theory without 
materially assisting scientific research. In recent years, I have been 
particularly aware of papers of this kind on multivariate analysis. 
Statisticians evidently hold widely divergent views on the practical 
importance of various types of multivariate analysis: I suggest that 
we need to examine carefully their relevance to the interpretation of 
experimental and observational data. 

In this note, I propose to be severely critical of the use of multi- 
variate analysis of variance and the construction of canonical variates 
in the analysis and interpretation of agricultural and other experiments. 
My argument can best be presented by reference to a particular example, 
and I therefore discuss in detail the recent paper by R. G. D. Steel 
(1955). Of course I intend no personal attack on Dr. Steel’s work, 
but his interesting paper happens to illustrate my criticisms especially 
simply and clearly. Employment of the methods he describes appears 
to be increasing, and other applications that are, in my view, equally 
unfortunate can be found (e.g., Dutton, 1954; Quenouille, 1950). 
Questions of choice of method are often less simple than an ardent 
partisan would have them: I look forward to having my own outlook 
criticized as severely as I criticize that of Dr. Steel and others, for 
clear thinking about the ultimate objectives of statistical analysis 
is more important than the vindication of a particular point of view. 

Steel has discussed an experiment comparing the yields of 25 varieties 
of alfalfa, grown on the same four randomized blocks in 1949 and 1950. 




He has given an excellent account of the numerical operations involved 
in a bivariate canonical analysis on the pairs of yields from each plot, 
and has shown the essential simplicity of calculations that sometimes 
terrify when expressed purely mathematically. Unfortunately, he 
has not made clear the aim of the experiment and the manner in which 
his analysis enables the agricultural scientist to reach conclusions 
relevant to his problem that could not have been obtained by simpler 
alternatives. He implies that such methods should be more generally 
adopted for experiments whose yields are recorded in several successive 
years, or in which more diverse multiple observations are made on each 
plot; I believe this to be a dangerously misleading policy. 

The choice of a multivariate canonical analysis for experiments 
such as that discussed by Steel perhaps originates in a belief that the 
primary purpose of statistical analysis there is the making of tests of 
significance. This is not so. In the alfalfa experiment, a general test 
of the significance of differences between varieties is of practically no 
interest. Personally I should be surprised if a good experiment on as 
many as 25 varieties of a crop failed to show significant differences 
between them. If it did not, I should suspect inadequate replication, 
unexpectedly large variation by comparison with the general habit of 
the crop, or careless management of the experiment, all of which are 
features of the circumstances of the experiment and not intrinsic 
characteristics of the varieties. It is surely almost inconceivable that 
25 varieties should be distinguishable by some characteristics yet should 
have identical yield parameters: indeed, I suspect that any two recog- 
nizably distinct varieties could be shown as differing “significantly” 
in yield by sufficient increase in replication or by test over a sufficient 
number of years. Absence of a significant result in a particular ex- 
periment is usually a commentary on the experimental technique 
rather than an addition to knowledge of the varieties. 

The chief value of the analysis of variance in a variety trial is to 
determine an error variance, whence are derived assessments of standard 
errors for differences between pairs of varieties or groups of varieties. 
These standard errors are important in further stages of a selection 
programme or in the formulation of practical recommendations, since 
they provide a basis for judging whether certain differences are large 
enough to make particular courses of action desirable. For this purpose, 
clearly, the variates analysed—or at any rate the variates whose mean 
values are finally summarized and discussed—must be those measures 
of crop performance that the investigator considers relevant to the 
judgements he must make. They may be the basic measurements 
(e.g. weights) made on the crop or they may be derived quantities 




(combinations of yields in several years, or starch equivalent, or per- 
centage dry matter), but they can be defined only by someone who 
knows the purpose of the experiment. They cannot be deduced purely 
from statistical analysis of the numerical values of the basic measure- 
ments. For experiments on a plant that has two distinct possible 
uses (e.g. flax-linseed), the analyses that are important will depend 
upon which use the investigator has in mind. Again, in the alfalfa 
experiment, records of plant height or of leaf area might have shown 
relatively greater differences between varieties than did the yields, 
but canonical variates including them also would probably have been 
less helpful to the investigator (though more highly significant in their 
differences) than those obtained from yields alone.* 

Had I been asked to analyse the alfalfa experiment, I should certainly 
have made an analysis of variance of the total yields over the two 
years, as this total production is likely to be the chief consideration in 
deciding the relative merits of varieties. I should also have analysed 
the difference in yield between the two years, as an aid to comparing 
varieties in respect of seasonal variability in yield, a factor that might 
be important in preferring one variety to another. The fact that any 
tests of significance in the two analyses might not be independent of 
one another would not worry me: the nature of the experiment demands 
consideration of these two combinations of yield. Nevertheless, Steel’s 
Table 4 suggests that any correlation between these two combinations 
(arising from unequal variances in 1949 and 1950) is quite trivial. In 
some circumstances, I might want analyses of yields in the two years 
separately, instead of or in addition to analyses of the total and difference, 
although for a forage crop grown on the same land for several years 
these will usually be less interesting.** If I were told that one ton of 
alfalfa in 1949 was twice as valuable as one ton in 1950, I might wish 
also to make an analysis of a new variate formed as twice the 1949 
yield plus the 1950 yield, although it is difficult to see what importance 
this could have in assessing the general merits of the varieties unless 
the greater value per unit weight in 1949 were a characteristic of the 
first-year growth of alfalfa rather than a consequence of particular 
economic conditions in that year. 
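
The remark that any correlation between the total and the difference arises from unequal variances in the two years can be checked with a one-line identity. This is a sketch only; $y_1$ and $y_2$ denote the plot yields in 1949 and 1950, symbols introduced here for illustration and not taken from Steel's paper:

$$\operatorname{Cov}(y_1 + y_2,\; y_1 - y_2) = \operatorname{Var}(y_1) - \operatorname{Cov}(y_1, y_2) + \operatorname{Cov}(y_2, y_1) - \operatorname{Var}(y_2) = \operatorname{Var}(y_1) - \operatorname{Var}(y_2).$$

The two combinations are therefore uncorrelated exactly when the error variances in the two years are equal, whatever the covariance between years.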

*He might wish to take account of these variates also, if height or leaf area had any economic or
other importance comparable with that of yield, but he would certainly adopt his own weighting of the
relative importance, not that represented by a canonical variate.

**Of course, the complete analysis of variance and covariance shown in Steel’s Table 2 is a conven-
ient computing procedure if more than two analyses of linear functions of the yields are required.

I cannot think of any circumstances in which the investigator
would be interested in Steel’s two canonical variates, or in other
combinations of yields for the two years chosen entirely on the internal
statistical evidence of the experiment. Admittedly the first of these 
variates is an optimal linear discriminant function between varieties, 
but for two reasons such a discriminant is without importance here. 
First, a function whose components are yields in two particular past 
years could not be used for discrimination in the future: the appropriate 
components, yield in 1949 and yield in 1950, can never be measured 
again! Secondly, the 25 varieties under test can presumably best be 
discriminated by whatever characteristics (habit of growth, leaf form, 
flower colour, etc.) identify them as varieties to the plant breeder, and 
no one would suggest that a character as variable in its manifestation 
as yield should be employed for the purpose! 

It happens that, in the alfalfa experiment, the sum and difference 
of the yields are almost as effective discriminators between varieties as 
the two canonical variates. My objection to the latter, however, is 
not that in this instance they could be replaced by something simpler 
without loss but that, whatever the numerical results of this experiment 
had been, their values would be of no interest to the experimenter 
whereas the sum and difference would always have been important. 
The mean values shown in Table 7 may represent a mathematical 
simplicity of structure, but as they stand they are not the slightest 
help to the investigator who wishes to know which varieties to subject 
to further more refined comparison or which varieties to release for 
commercial use. Steel claims that ‘This analysis helps locate varieties 
that are consistently good (poor) but sometimes do even better (poorer) 
than expected and ones that are good (poor) in some years and not 
exceptional or are even poor (good) in others’. The terms good and 
poor, however, are meaningful only with reference to factors outside 
the experiment, concerned with the use to which the produce is to be 
put: no internal statistical analysis of multiple measurements on each 
plot can produce a function that is a measure of “goodness”.

Steel suggests that the property of independence possessed by the 
means in his Table 7 ‘is useful in making exact probability statements’. 
My contention is that inexact or non-independent probability state- 
ments about quantities that are meaningful to the interpretation of the 
experiment are far preferable to exact statements about quantities 
that are only mathematical abstractions. To have exact statements 
about the right quantities would be better still. Possibly the theo- 
retical exactness of simultaneous fiducial or confidence statements 
about the various comparisons of totals and differences of yields that 
are important to the research worker could be improved by first formu-
lating such statements for independent canonical variates and then
converting these into statements about the variates required. The
tests of significance in the canonical analysis would still be unimportant, 
but that analysis might prove to be a convenient computing technique 
for obtaining what was ultimately wanted. In the alfalfa experiment, 
I am sure that nothing of practical importance would have been gained 
by such refinements, but the possibility may merit further study as a 
realistic application of multivariate theory to plant breeders’ problems. 
Its elaboration can be left to those who are expert in that theory: it 
might, for all I know, follow as a fairly easy development of the work 
of Tukey and others on simultaneous interval estimation. My present
concern is to persuade biometricians that, without such interpretation 
in terms of estimation instead of significance testing, in field experiments 
and in many other research problems the type of multivariate analysis 
illustrated by Steel is usually inappropriate and often actively mis- 
leading. 


REFERENCES 


Dutton, A. M. (1954). Application of some multivariate analysis techniques to data 
from radiation experiments. In Statistics and Mathematics in Biology, edited by 
O. Kempthorne, T. A. Bancroft, J. W. Gowen, and J. L. Lush, Ames: Iowa State 
College Press. 

Quenouille, M. H. (1950). Multivariate experimentation. Biometrics, 6, 303-316. 

Steel, R. G. D. (1955). An analysis of perennial crop data. Biometrics, 11, 201-212. 




THE RELATION BETWEEN QUANTAL AND GRADED 
RESPONSES TO DRUGS 


P. S. HEWLETT AND R. L. PLACKETT


Pest Infestation Laboratory, Slough, Bucks, and 
Department of Applied Mathematics, University of Liverpool 


1. A curious dichotomy exists in the study of biological responses 
to drugs. Responses, justifiably enough, are regarded as of two distinct 
types—quantal and graded. Quantal responses are those which classify 
an organism or other unit of biological material as having responded 
or not; for example, death, paralysis, etc. A graded response is such 
that the single organism gives a response in quantitative terms; for 
example, a change in weight, or a change in blood pressure. The 
statistical treatments of the two types of response show some similarities, 
largely because regression techniques are used for both; but biologically 
speaking the quantitative descriptions of the two types of response 
have been kept rigidly separate. No attempt seems to have been made 
to discover any connection between the dose-response relationships for 
quantal responses on the one hand and those for graded responses on 
the other. 

2. The purpose here is to view the events following the strictly 
controlled administration of a drug in such a way that the probable 
interrelationships of succeeding graded and quantal responses can be 
seen and formulated quantitatively. First, however, it is necessary 
to comment on the interpretation of functions relating quantal re- 
sponses to dose. 

3. We shall, as is commonly done, assume that the relation between 
a quantal response and dose represents a cumulative distribution of 
tolerances, where the tolerance of an individual organism (or other 
unit of biological material) is the dose of drug just insufficient to make 
it show the quantal response concerned. It is fitting that a concept 
such as that of a tolerance should be scrutinized from time to time: 
Berkson (1951) has in fact challenged the concept, but he proposed 
no alternative interpretation. He cited an experiment in which indi- 
vidual men reacted differently on different occasions; but an interpre- 
tation in terms of tolerances of the results of dosage on a certain occasion 
does not depend upon the tolerance of each organism remaining constant 


72 


QUANTAL AND GRADED RESPONSES 73 


in time. There is, indeed, experimental evidence that the response-time 
of an individual organism to a given dose of drug can vary from one 
occasion to another (Bliss & Beard, 1953), and by analogy tolerances 
might do the same. Berkson’s other example concerned X-rays, which 
are outside our scope. However, he ridiculed the idea of tolerance by 
comparing it to “the resistance of the targets to getting hit in the
bull’s-eye”; but such an interpretation might be highly suitable if the
size of the bull’s-eye varied from target to target. 

If the experimental conditions are rigorously controlled, and if one 
organism shows a quantal response to a drug whereas another treated 
in the same way does not, it seems reasonable to attribute the difference 
to an inherent difference between the organisms during the period of 
action of the drug. The concept of tolerance can be compared to that 
of weight. At a given time the weights of individuals of a species may 
differ, and the weight of each may alter with time; but the fact that a 
set of individuals may even be ordered differently by weight on different 
occasions does not invalidate the supposition that an individual has a 
definite weight at a certain moment. We conceive a tolerance, if such 
exist, to be an inherent characteristic of an organism, just as its weight 
is, though of course a tolerance might show temporal variation about 
its general level relatively greater than does weight. On the evidence 
at present available we see no reason to reject the concept of tolerance 
as a working hypothesis. 

4. It seems certain that quantitative changes of some kind always
accompany the action of a drug, even if most of these changes are not
so readily observable as the graded responses commonly used in bioassay.
These changes might be increases in the concentration of certain
substances within the organism—for example the accumulation of 
acetylcholine following administration of an anticholinesterase. A 
considerable weight of evidence is against the supposition that drug 
action consists of a few indivisible physiological events in an organism 
(see Clark 1933). It seems legitimate to regard any of the quantitative 
changes resulting from drug action as graded responses. 

5. We can now put forward a hypothesis that leads to a unification 
of the mathematical treatment of the two types of response; namely, 
that an individual organism responds quantally if an underlying quanti- 
tative change that results from administration of the drug, and that can 
be regarded as a graded response, reaches a certain level of intensity 
characteristic of that individual organism. If the dose of drug is insufficient 
to bring the quantitative change to the critical level, the quantal 
response will not occur. Hence the idea of a tolerance follows im- 
mediately. 




6. It should not, however, be assumed that a quantal response in 
an individual organism could necessarily be related to any of the 
quantitative changes resulting from administration of the drug. Differ-
ent quantal responses might stem from more or less distinct trains of
events. If the situation for two trains could be represented thus 


               -> Quantitative changes A -> Quantal responses A’
Dose of drug
               -> Quantitative changes B -> Quantal responses B’


A’ could most usefully be related to A, and B’ to B. A’ could not usefully 
be related to B nor B’ to A unless the correlation between A and B 
happened to be very high. 

7. In the analysis of quantal response data, we generally suppose
that the tolerance distribution has a frequency function of the form
$\beta f(\alpha + \beta x)$, where $x$ measures the amount of the drug on some con-
venient scale, and $\alpha$, $\beta$ are unknown constants. The proportion of
organisms responding to the drug is then

$$P = \int_{-\infty}^{\alpha + \beta x} f(t)\,dt. \tag{1}$$

For example, probit analysis (Bliss, 1935; Finney, 1952) arises by taking

$$f(t) = \phi(t) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} t^2);$$

and logit analysis (Berkson 1944, 1953) when

$$f(t) = \lambda(t) = \tfrac{1}{4} \operatorname{sech}^2(\tfrac{1}{2} t).$$

The relative advantages of probits and logits have been discussed at
length elsewhere, and we do not pursue this question here. Whatever
the choice of $f(t)$, the observations consist of the numbers of organisms
which have responded, or failed to respond, at various doses. From
these data, by assuming that organisms behave independently of one
another, and applying some principle such as maximum likelihood or
minimum $\chi^2$, we arrive at estimates of $\alpha$ and $\beta$.
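
As a rough illustration of relation (1), the following sketch evaluates the proportion responding under the probit and logit choices of f(t). The function names and the values of alpha and beta are hypothetical, chosen only for the example; the logistic density is also integrated numerically to check that its integral is the familiar logistic function.

```python
import math


def probit_proportion(x, alpha, beta):
    """P in (1) when f is the standard normal density."""
    return 0.5 * (1.0 + math.erf((alpha + beta * x) / math.sqrt(2.0)))


def logit_proportion(x, alpha, beta):
    """P in (1) when f(t) = (1/4) sech^2(t/2); the integral is the logistic function."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def logistic_density(t):
    """The frequency function f(t) = (1/4) sech^2(t/2)."""
    return 0.25 / math.cosh(0.5 * t) ** 2


def integrate(f, lower, upper, steps=20000):
    """Midpoint-rule approximation to the integral of f from lower to upper."""
    h = (upper - lower) / steps
    return h * sum(f(lower + (i + 0.5) * h) for i in range(steps))


# Hypothetical constants: 50 per cent response at x = 2 on the chosen dose scale.
ALPHA, BETA = -4.0, 2.0

for dose in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(dose,
          round(probit_proportion(dose, ALPHA, BETA), 4),
          round(logit_proportion(dose, ALPHA, BETA), 4))
```

Both curves pass through 0.5 where alpha + beta x = 0; they differ in how quickly the tails approach 0 and 1.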

8. On the other hand, when we are concerned with graded responses, 
each organism exhibits a response $y$, which is measured on a continuous
scale, and thus provides more information than the elementary dead-or- 
alive or other binary classification of quantal response. For each value 
of x, the responses are distributed among the population of organisms 
with a frequency function which can be converted to the normal, logistic, 
or any other form, by suitably choosing the scale of y. In particular, if 
the distribution is normal, with variance independent of x, and the
factors affecting response enter linearly, the way is then open for the
application of analysis of variance techniques. 

9. In order to connect these two forms of analysis we postulate for
every organism the existence of a critical graded response $c$, which is
such that the quantal response occurs if the graded response exceeds $c$,
but not otherwise. When $x$ is given, the simplest possibility is that in
which $c$ has the same value $\kappa$ for each organism, but more generally $y$
and $c$ will have some bivariate distribution among the population of
organisms. To show how the tolerance distribution is derived from the
graded response distribution, we consider two special cases.

(i) The critical graded response is a constant $\kappa$; and the distribution
of graded responses has mean $\theta + \gamma x$ and frequency function
$f\{(y - \theta - \gamma x)/\sigma\}/\sigma$, where $f$ is symmetrical about zero. If $f$ is normal, $\sigma$ is the
standard deviation; if $f$ is logistic, $\sigma$ is $(\mathrm{S.D.})\sqrt{3}/\pi$. Then

$$P = \operatorname{Prob}(y > \kappa) = \int_{-\infty}^{(\theta + \gamma x - \kappa)/\sigma} f(t)\,dt, \tag{2}$$

giving

$$\frac{\gamma}{\sigma}\, f\!\left(\frac{\theta + \gamma x - \kappa}{\sigma}\right)$$

for the frequency function of the tolerance distribution. This formula-
tion of $P$ agrees with (1) provided that

$$\alpha = (\theta - \kappa)/\sigma \quad \text{and} \quad \beta = \gamma/\sigma. \tag{3}$$

(ii) The quantities $y$ and $c$ are jointly distributed with means
$\theta + \gamma x$ and $\kappa$ respectively, in such a way that the distribution of
$z = y - c$ has the frequency function

$$\frac{1}{\sigma}\, f\!\left(\frac{z - \theta - \gamma x + \kappa}{\sigma}\right),$$

where $f$ is symmetrical about zero. Then

$$P = \operatorname{Prob}(y - c > 0) = \int_{-\infty}^{(\theta + \gamma x - \kappa)/\sigma} f(t)\,dt, \tag{4}$$

which again gives

$$\frac{\gamma}{\sigma}\, f\!\left(\frac{\theta + \gamma x - \kappa}{\sigma}\right)$$

for the frequency function of the tolerance distribution. Evidently the
same tolerance distribution can be derived from widely differing as-
sumptions.
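
The two special cases can be illustrated by simulation. The sketch below is not part of the original argument: it assumes normal distributions throughout, takes y and c to be independent in case (ii), and uses hypothetical parameter values. It compares the simulated proportion of organisms whose graded response exceeds their critical response with the proportion predicted from the tolerance formulation, writing alpha = (mean of y at zero dose - mean of c)/sigma and beta = gamma/sigma.

```python
import math
import random


def normal_cdf(t):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def tolerance_parameters(theta, gamma, sigma, kappa):
    """alpha and beta of the tolerance formulation, from the graded-response constants."""
    return (theta - kappa) / sigma, gamma / sigma


def simulated_proportion(x, theta, gamma, sd_y, kappa, sd_c, n, rng):
    """Fraction of n simulated organisms with graded response y above critical response c."""
    responded = 0
    for _ in range(n):
        y = rng.gauss(theta + gamma * x, sd_y)
        c = rng.gauss(kappa, sd_c)  # sd_c = 0 reproduces the constant critical response
        if y > c:
            responded += 1
    return responded / n


# Hypothetical values, chosen only for illustration.
THETA, GAMMA, KAPPA, DOSE = 1.0, 2.0, 3.0, 1.2
rng = random.Random(1956)

# Case (i): constant critical response; sigma is the S.D. of y.
alpha_i, beta_i = tolerance_parameters(THETA, GAMMA, 1.0, KAPPA)
p_theory_i = normal_cdf(alpha_i + beta_i * DOSE)
p_sim_i = simulated_proportion(DOSE, THETA, GAMMA, 1.0, KAPPA, 0.0, 100_000, rng)

# Case (ii): y and c both vary; sigma is the S.D. of z = y - c.
sigma_z = math.hypot(0.6, 0.8)  # S.D. of z for independent y (0.6) and c (0.8)
alpha_ii, beta_ii = tolerance_parameters(THETA, GAMMA, sigma_z, KAPPA)
p_theory_ii = normal_cdf(alpha_ii + beta_ii * DOSE)
p_sim_ii = simulated_proportion(DOSE, THETA, GAMMA, 0.6, KAPPA, 0.8, 100_000, rng)

print(p_theory_i, p_sim_i)
print(p_theory_ii, p_sim_ii)
```

The two sets of assumptions were chosen to give the same sigma, so they lead to the same alpha and beta and hence to the same dose-response curve, although the spread of the graded response itself differs between them.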




10. The interpretation of quantal response data by a tolerance 
distribution can, therefore, if philosophically unsatisfactory, be replaced 
by an interpretation in terms of graded responses. With a constant 
critical graded response, both formulations are equivalent symbolically 
if we replace characteristic tolerance by graded response and dose by 
critical graded response. In most biological problems where graded 
responses are available, it would of course be natural to analyse the 
data without introducing a critical graded response, classifying the 
observations on a quantal basis, and making the calculations much more 
laborious. However, when controlling the quality of mass-produced 
articles, we may either measure a physical dimension or—more quickly 
and cheaply—see whether it exceeds one fixed standard, falls below 
another, or does neither. The amount of information lost by con- 
sidering an experiment only from the quantal viewpoint has been 
examined in this context by Stevens (1948), and had previously been 
investigated in a general context by Pearson (1920). These writers 
show that high efficiencies are possible under certain conditions, but 
—returning to the biological situation—the question of efficiency is 
not the only factor we have to consider, because we lose not only 
information in Fisher’s sense, but any knowledge of the spread of the 
graded response distribution, as is shown above by the way the param-
eter $\sigma$ is absorbed into $\alpha$ and $\beta$.

11. In view of the results stated in (i) and (ii) of section 9, it would
obviously be interesting to compare experimental estimates for $\beta$ and
$\gamma/\sigma$. We sought to compare the value of $\beta$ for a quantal response with
that of $\gamma/\sigma$ for a graded response to the same drug in the same organism.
Moreover, for a valid comparison to be possible, the two responses should
have resulted from the same train of biological events, as explained in
section 6. If in a number of comparisons the value of $\beta$ commonly
approximated to that of the corresponding $\gamma/\sigma$, the value of $\gamma/\sigma$ would
have seemed largely to determine that of $\beta$; and this would have indicated
the theory to be correct.

Needless to say, pairs of values satisfying the requirements were 
not available. Instead we could only collect sets of values for the 
two quantities (or rather, their reciprocals) for an assortment of bioassay 
methods. Even so, it was difficult to assemble the values from the litera-
ture on a fair and rational basis. It happens that estimates of $1/\beta$ and
$\sigma/\gamma$ have been used to compare the sensitivities of different methods of
bioassay. In consequence Gaddum (1933) and Bliss & Cattell (1943) 
collected a number of estimates from the literature. However, each 
value from an assay method of one type was not matched by a value 
from one of another type that corresponded in the sense explained 


i 
| 
| 
pat 
| 
a 
: 


QUANTAL AND GRADED RESPONSES 77 
above. Moreover, the samples may be biased: high values for the two 
quantities may not have been recorded in the literature, because high 
values indicate that the experimental methods by which they were 
obtained are inefficient for bioassay. For the want of better samples, 
we have based Table 1 on their data, and it shows the frequencies with
which the different values of $1/\beta$ and $\sigma/\gamma$ fell within consecutive ranges.
In order to obtain a more homogeneous sample, Table 1 includes values 
only from assay methods that employed vertebrate material. 


TABLE 1
The frequencies of different values of estimates of $1/\beta$ for quantal responses and $\sigma/\gamma$
for graded responses,* determined from a sample of bioassay methods for a variety of
drugs, employing vertebrate material.


                                     Number of values
Range of                  $1/\beta$ (quantal      $\sigma/\gamma$ (graded
$1/\beta$ or $\sigma/\gamma$            responses)          responses)

0    - 0.05                     2                   2
0.05 - 0.1                     11                   6
0.1  - 0.2                     23                  13
0.2  - 0.3                      4                  13
0.3  - 0.4                      4                   8
0.4  - 0.5                      3                   3
0.5  - 0.6                      1                   0
0.6  - 0.7                      1                   0
0.7  - 0.8                      2                   0
0.8  - 0.9                      0                   0
0.9  - 1.0                      1                   0

Total                          52                  45


*Values taken from Gaddum (1933) and Bliss & Cattell (1943).


For the reasons given, it would be idle to apply a $\chi^2$ or other test to
compare these two frequency distributions. It is sufficient to notice 
that there are obvious similarities with respect to both position and 
spread. Thus, although the evidence is very indirect and uncertain, 
there is enough agreement between theory and the results of a hetero-
geneous collection of experiments to suggest that the value of $\gamma/\sigma$ for
an underlying graded response may often be an important factor in
determining the value of $\beta$ for a quantal response. In order to test the
theory adequately, however, data would probably have to be obtained 
especially for the purpose. 




SUMMARY 


The events following the administration of a drug are so viewed 
that the probable interrelationship of the resultant graded and quantal 
responses can be formulated quantitatively. Experimental data 
suitable for testing the predictions from the theory were not found in 
the literature, but such as are relevant lend slight support. To obtain 
suitable data, special experiments would probably need to be carried out. 


REFERENCES 


Berkson, J. (1944). Application of the logistic function to bio-assay. J. Amer. 
Statist. Assoc., 39, 357-65. 

Berkson, J. (1951). Why I prefer logits to probits. Biometrics, 7, 327-39. 

Berkson, J. (1953). A statistically precise and relatively simple method of estimating 
the bio-assay with quantal response, based on the logistic function. J. Amer. 
Statist. Assoc., 48, 565-99. 

Bliss, C. I. (1935). The calculation of the dosage-mortality curve. Ann. Appl. Biol., 
22, 134-67. 

Bliss, C. I. and R. L. Beard (1953). Static and dynamic variation in the response of 
an insect to sub-lethal doses of two gases. Connecticut Agr. Exp. Sta. Bull. no. 577. 

Bliss, C. I. and McK. Cattell (1943). Biological assay. Ann. Rev. Physiol., 5, 479-539.

Clark, A. J. (1933). The mode of action of drugs on cells. Arnold, London. 

Finney, D. J. (1952). Probit analysis. Second edition. Cambridge University Press. 

Gaddum, J. H. (1933). Report on biological standards. III. Methods of biological 
assay depending on a quantal response. Spec. Rep. Ser., Med. Res. Coun., Lond., 
no. 183, H. M. Stationery Office. 

Pearson, K. (1920). On the probable errors of frequency constants. Part III. Bio- 
metrika, 13, 113-32. 

Stevens, W. L. (1948). Control by gauging. J. Roy. Statist. Soc., Ser. B, 10, 54-98; 
discussion 98-108. 




ONE LIKELIHOOD ADJUSTMENT MAY BE INADEQUATE 
H. W. Norton 


University of Illinois 


Fisher (1925) says “. . . since the equations of maximum likelihood 
do not always lend themselves to direct solution, it is of importance 
that, starting with an inefficient estimate, we can, by a single process 
of approximation, obtain an efficient estimate .... It is sufficient for 
our purpose that the error of estimation is of the order $n^{-1/2}$” and “. . .
starting with an inefficient statistic, a single process of approximation
will in ordinary cases give an efficient statistic differing from the
maximum likelihood solution, by a quantity which with increasing
samples decreases as $n^{-1}$”. The problem is more complicated than Fisher
made it appear. This paper uses a simple example to show that, even
“in ordinary cases”, one likelihood adjustment is sometimes quite
inadequate. 

Fisher (1950, chapter 9) takes the genetical problem of the frequency 
of crossing over to illustrate the problem of statistical estimation. He 
indicates five different statistics which might be adopted as solutions 
of the problem, each being consistent and each having sampling variance 
inversely proportional to the size of the sample. The first and second 
of these five estimates are inefficient, and might be used as starting 
points for the adjustment Fisher proposed. 

TABLE 1

Application of the method of maximum likelihood to improve an inefficient statistic.

Order of Improvement     Estimate     Adjustment     $-\partial^2 L/\partial\theta^2$
         0               0.057046     -0.031419           12341
         1               0.025627      0.007374           51118
         2               0.033001      0.002522           31802
         3               0.035523      0.000189           27787
         4               0.035712      0.0000003          27520

The first of these estimates and its successive improved values
appear in Table 1. The first entry in the second column is the value
given by Fisher for the inefficient statistic; the last is the maximum
likelihood estimate, also given by Fisher. The fourth column gives 
the negative of the second derivative of the logarithm of the likelihood 
function. From the second derivative, the sampling error of the maxi- 
mum likelihood estimate is found to be 0.006028, so that the once- 
improved estimate, 0.025627, differs from the likelihood estimate by 
over 1.67 times the sampling error. Hence the once-improved estimate 
must be considered inadequate. Furthermore, the twice-improved 
estimate is afflicted with an error of estimation of nearly half a sampling 
error and would usually be thought unsatisfactory. 
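
The successive adjustments in Table 1 can be reproduced with a few lines of code. The sketch below rests on assumptions not stated in this note: the four class counts 1997, 906, 904, 32 and the class probabilities (2 + theta)/4, (1 - theta)/4, (1 - theta)/4, theta/4 are those of Fisher's linkage example as usually quoted, and the starting value is taken to be (a - b - c + d)/n, which reproduces the tabulated 0.057046. Function names are illustrative.

```python
import math

# Assumed class counts from Fisher's linkage example (not given in this note).
A, B, C, D = 1997, 906, 904, 32
N = A + B + C + D


def score(theta):
    """First derivative of the log likelihood A*log(2+t) + (B+C)*log(1-t) + D*log(t)."""
    return A / (2 + theta) - (B + C) / (1 - theta) + D / theta


def information(theta):
    """Negative of the second derivative of the log likelihood."""
    return A / (2 + theta) ** 2 + (B + C) / (1 - theta) ** 2 + D / theta ** 2


def newton_table(theta0, steps):
    """Rows (order, estimate, adjustment, information) for successive adjustments."""
    rows = []
    theta = theta0
    for order in range(steps + 1):
        info = information(theta)
        adjustment = score(theta) / info  # one likelihood adjustment (Newton step)
        rows.append((order, theta, adjustment, info))
        theta += adjustment
    return rows


# Assumed form of the inefficient starting statistic.
start = (A - B - C + D) / N
rows = newton_table(start, 4)
for order, estimate, adjustment, info in rows:
    print(order, round(estimate, 6), round(adjustment, 7), round(info))

# Sampling error of the maximum likelihood estimate from the final information.
print(round(1.0 / math.sqrt(rows[4][3]), 6))
```

Small differences in the last digit from the printed table are to be expected, since the table was presumably computed with rounded intermediate values.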

A separate consideration of substantial importance is that several 
iterations are necessary before the second derivative reaches good 
agreement with its value at the maximum of the likelihood. In par- 
ticular, Table 1 shows that, if the second derivative were not recalculated 
after the first adjustment, the estimated sampling variance would be 
over twice the sampling variance of the maximum likelihood estimate. 
On the other hand, if the second derivative is recalculated after the 
first adjustment, the estimated sampling variance would be barely 
half as large as that of the likelihood estimate. Misestimation of the 
sampling variance is a serious error, just as it usually is a serious error 
to use an inefficient statistic. Furthermore, it may be worthwhile to 
recalculate the second derivative so as to speed convergence to the 
maximum likelihood estimate. 
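
The two variance comparisons in the paragraph above follow from the fourth column of Table 1, taking the estimated sampling variance as the reciprocal of the negative second derivative. A sketch of the arithmetic only:

$$\frac{1/12341}{1/27520} = \frac{27520}{12341} \approx 2.23, \qquad \frac{1/51118}{1/27520} = \frac{27520}{51118} \approx 0.54.$$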

The process of approximation proposed by Fisher is simply Newton’s 
method, and so is of wide generality. However, the example shows 
that it is too much to expect that one adjustment of an inefficient 
estimate will result in reasonable agreement with the maximum likeli- 
hood estimate. In fact, one adjustment will be adequate only when 
the value of the second derivative of the logarithm of the likelihood, 
evaluated at the inefficient estimate, differs little from its average
value in the interval from the inefficient estimate to the maximum
likelihood estimate. This will usually be true only when the inefficient 
estimate is already close to the maximum likelihood estimate, that is, 
when its efficiency is fairly high. 

In the example, the efficiency of the inefficient estimate is only 
about 14%, and it is over 3.5 sampling errors away from the maximum 
likelihood estimate. Also, the second derivative changes rapidly, and 
its value of -12341, at the inefficient estimate, is a poor approximation
to -18175, its average value over the interval from the inefficient
estimate to the maximum likelihood estimate. Hence the first adjust- 
ment is nearly half again as large as it should be. 
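
The figures in the paragraph above can be recovered from Table 1. In this sketch $S(\theta)$ denotes the first derivative of the log likelihood, $I(\theta)$ the negative of its second derivative, $T$ the inefficient estimate and $\hat\theta$ the maximum likelihood estimate; this notation is introduced here for illustration. Since $S(\hat\theta) = 0$,

$$S(T) = -\int_T^{\hat\theta} S'(\theta)\,d\theta = \bar{I}\,(\hat\theta - T), \qquad \bar{I} = \frac{1}{\hat\theta - T}\int_T^{\hat\theta} I(\theta)\,d\theta,$$

so the exact correction is $S(T)/\bar{I}$ whereas one adjustment gives $S(T)/I(T)$. Numerically,

$$S(T) = 12341 \times (-0.031419) \approx -387.7, \qquad \bar{I} = \frac{-387.7}{0.035712 - 0.057046} \approx 18175,$$

$$\frac{S(T)/I(T)}{S(T)/\bar{I}} = \frac{18175}{12341} \approx 1.47, \qquad \frac{0.057046 - 0.035712}{0.006028} \approx 3.5 .$$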

These considerations apply a fortiori to simultaneous estimation
of two or more quantities. It appears that the only safe rule, in cases 
which have not been thoroughly investigated mathematically, is to 
repeat the iterative process until the adjustments are small and the 
covariance matrix is stable. 


REFERENCES 


Fisher, R. A. (1925). Theory of statistical estimation, Proc. Camb. Phil. Soc., 22: 
700-725. 

Fisher, R. A. (1950). Statistical Methods for Research Workers, 11th ed., Oliver and 
Boyd, Edinburgh. 

Fisher, R. A. and Bhai Balmukand (1928). The estimation of linkage from the 
offspring of selfed heterozygotes. J. Genetics, 20: 79-92. 




QUERIES 


GEORGE W. SNEDECOR, Editor


QUERY 119: In field experiments with plant spacing, there will be
different numbers of plants per plot for the various spacings, if
plot size be constant. We have an experiment with maize, a $4^3$
factorial confounded in 4 randomized blocks. The factors are levels of
nitrogen, phosphorus and spacing. We have run into difficulties in
attempting to analyze effects of spacing upon proportion-of-fruitful-
plants, number-of-cobs-per-fruitful-plant, weight-of-grain-per-fruitful-
plant. Inspection of the data reveals that variation in fruitful plant
number tends to be related to the mean, directly in the case of spacings,
and inversely in the case of levels of phosphate. How may the data best
be handled? Would covariance on number of plants per plot be ap-
propriate? If so, what about heterogeneous error regressions?


ANSWER: Since number of plants per plot is purposely varied by the
different spacing treatments, the only reasonable value
to examine in assessing the influence of these treatments
on plant fruiting is the proportion of fruitful plants and not the actual
count of fruitful plants per plot. The straight analysis of variance of
this variable is first examined since the F values are not seriously
affected by heterogeneous errors. (See Biometrics 3:1-52.)


                                   Mean Squares
Source          d.f.     % fruitful plants     Degrees

Blocks            3            118               214
Nitrogen          3             25               117
  Linear          1             49               280*
  Quad.           1             15                65
Phosphorus        3             85*              122*
  Linear          1            143*              140
  Quad.           1            111*              224*
Spacing           3            119*              240**
  Linear          1            334**             685**
  Quad.           1             16                35
N X P             8             17                36
N X S             8             20                31
P X S             8             31                40
Error            27             26                42


The analysis of percentages indicates a linear and quadratic effect 
of phosphorus and reference to the means below shows that the effect 
is a positive one which levels off at the higher rates. Spacing has a
strong linear effect. No interactions are important. Following are
the means for the three factors. The range within each treatment is 
given in parentheses as a rough guide to the relative variation. 


              Nitrogen                 Phosphorus               Spacing
Level      %          Degrees       %          Degrees       %          Degrees
  1     93.2 (17.5)    76.3      90.8 (30)      74.4      90.7 (20)      73.0
  2     93.7 (20.0)    76.6      94.6 (18)      79.2      93.0 (30)      77.6
  3     93.4 (30.0)    77.8      96.2 (9)       80.9      96.1 (10)      80.3
  4     95.9 (22.5)    82.1      95.0 (20)      78.2      97.0 (15)      81.9


It is common experience to find heterogeneous variances among 
means of percentage data if some of the means are near the limit and 
others are near the middle of the range. In this example the highest 
spacing mean is 97.0% and the lowest is 90.7%. A Bartlett test of the 
heterogeneity of variances within the spacing treatments gives $\chi^2$ =
14.5** (d.f. = 3). The pattern of heterogeneity may be seen from the
ranges in the above table. 

The usual correction for this difficulty is to use the angular trans- 
formation. When this is done, the analysis of variance changes to that 
shown in the last column of the analysis table. These results are a 
little unusual for this type of transformation. The effect of nitrogen 
was only mildly suggestive before, but now a linear effect is significant. 
The phosphorus effect is more strongly quadratic. The spacing effect 
and interactions are relatively unchanged. The heterogeneity of 
variances within spacings now gives $\chi^2$ = 4.3, which is non-significant.
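
For readers unfamiliar with it, the angular transformation replaces a percentage by the angle whose sine is the square root of the corresponding proportion. The following is a minimal sketch; the function name is illustrative, and the percentages used are the spacing means quoted above.

```python
import math


def angular_degrees(percent):
    """Angular (arcsine square-root) transformation of a percentage, in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


# The spacing means quoted above, with 50 per cent added for comparison.
for percent in (50.0, 90.7, 93.0, 96.1, 97.0):
    print(percent, round(angular_degrees(percent), 1))
```

The transformation stretches the scale near 100 per cent, which is where the spacing means lie. The tabulated means in degrees are averages of transformed plot values, so they need not equal the transforms of the mean percentages.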

These results suggest that the transformation be examined a little 
more closely. The spacing treatments involved a considerable difference 
in numbers of plants per plot, with a consequent possibility of an effect 
on the binomial portion of the variance. However, there seemed to be 
a counterbalancing drift in the means as shown below: 


                                   Variance within spacing treatments
Spacing in row    Avg. No.
                  Plants/Plot        %        Degrees     Expected Binomial

     10"             109            37.7       35.2         [illegible]
     15"              74            64.4       92.2            8.8
     20"              58             9.5       40.4            6.5
     25"              46            17.8       56.4            6.3



The above table shows that the binomial portion of the variance is 
relatively constant, and furthermore, it constitutes only a quarter of 
the total error (= 26). In view of this, there is reason to question the 
appropriateness of the transformation used. If the sole criterion of the 
success of a transformation is the equalization of the variances, this 
one is satisfactory, even though it was suggested under false pretenses. 
The other criteria that need to be given consideration are discussed by 
Bartlett (Biometrics 3:39). They seem to be as well (or better) satisfied 
by the transformed data as they were on the original percentage scale. 
The association of the means and the variances which was observed by 
the investigator is largely due to two or three extreme values which 
fall at different places in the scale of the different factors studied, and 
therefore cannot be corrected by transformation for all factors. It is 
possible that the investigator can find from his field notes explanations 
for these extreme values and can either learn more about the observed 
responses or justify discarding them from the analysis. There does not 
appear to be any further help to suggest from the data themselves. 
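
The binomial portion of the variance referred to above can be sketched as follows, on the assumption that the expected binomial variance of a percentage is p(100 - p)/n, with p the mean percentage of fruitful plants for a spacing and n the average number of plants per plot. The formula is not stated in the answer, but it reproduces the three legible entries of the table; the function name is illustrative.

```python
def binomial_variance_of_percent(mean_percent, plants_per_plot):
    """Binomial sampling variance of a percentage, in squared percentage units."""
    return mean_percent * (100.0 - mean_percent) / plants_per_plot


# (spacing, average plants per plot, mean % fruitful), taken from the tables above.
spacings = [('10"', 109, 90.7), ('15"', 74, 93.0), ('20"', 58, 96.1), ('25"', 46, 97.0)]

values = []
for label, plants, percent in spacings:
    variance = binomial_variance_of_percent(percent, plants)
    values.append(variance)
    print(label, round(variance, 1))

# Average binomial component as a fraction of the error mean square of 26.
print(round(sum(values) / len(values) / 26.0, 2))
```

Under this assumption the wider spacings have fewer plants per plot but percentages nearer 100, and the two effects roughly cancel, which is the counterbalancing drift mentioned in the answer.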

The response to plant spacing can be regarded as a continuum over 
the range of interest. If specific points along this range are fixed experi- 
mentally and their effect on a growth variable measured, experimental 
control must be precise to be interpreted. Covariance analysis makes 
the tacit assumption that the slope of the response surface is the same 
for all treatments. This obviously cannot be so in this experiment. 
Furthermore, irregularities in stand give different patterns of effect 
depending on whether they represent single missing plants or multiple 
skips. This is bad enough in experiments studying other treatments, 
but in spacing studies it is most difficult. Therefore, it is suggested 
that no covariance be attempted. 

J. A. RIGNEY 


QUERY 120: I have recently been trying to overcome a problem 
occurring in fertility records of a group of Merino sheep. 
Fertility (lambs born per year) in Merinos is usually restricted 
to 3 classes: no lambs, one lamb and occasionally twin (2) lambs. 
My problem is to analyse, for purposes of estimating heritability and 
genetic correlations, the records of say 300 ewes with fertility records 
in six years to obtain an estimate of the repeatability of fertility between 
years. The binomial distribution of these data, however, precludes the 
use of analysis of variance and intraclass correlation. 

My aim in estimating repeatability is to express records obtained 
for less than 6 years on a comparable basis with the full 6 year records 
(by means of the most probable (re)producing ability of Legates and 
Lush, J. Dairy Sci., 1954, p. 744). 

I would be grateful if you could suggest a solution of this problem 
or an alternative method of utilising all records regardless of their 
being for a period of less than full term. 


ANSWER: You might try using the analysis of variance. It is not 
precluded as completely as you imply. Probably your 
distribution isn’t purely binomial, else you wouldn’t 
have any repeatability. It is more likely to be approximately binomial 
for each ewe by herself but around a probability of twinning which 
varies from ewe to ewe. That is, the variance within ewes may be 
purely binomial but that between ewes contains an additional element. 
What you really want is the size of that additional element as compared 
with the variance within ewes. Number of lambs at a birth can be 
considered a continuously distributed characteristic which, for ana- 
tomical or mechanical reasons, is limited to three classes (four if triplets 
occur also) in its expression. This is to say that the difficulty is not 
in any fundamental difference between binomial and continuous 
distributions but in the coarseness of grouping. Where the variance 
within ewes is such a large fraction of the total as it will be here, the 
correlation between mean and variance won’t interfere much with the 
analysis of variance, although it is part of the problem. With the 
grouping this coarse and the classes so few, no transformation of scale 
is likely to help. If you wish to pursue this line of thought further, you 
might consult an article by W. G. Cochran in 1943 in the Journal of 
the American Statistical Association, 38:287-301. The title of the 
article is “Analysis of Variance for Percentages Based on Unequal 
Numbers”. 
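
As a rough illustration of the analysis of variance suggested above, the following Python sketch estimates repeatability as the between-ewe variance component divided by the sum of the between-ewe and within-ewe components. The function name, the (ewe, lambs) record layout and the use of the customary unequal-numbers coefficient k0 are assumptions made for this illustration; they are not taken from the original answer.

```python
from collections import defaultdict


def repeatability_anova(records):
    """Estimate repeatability from (ewe_id, lambs_born) pairs.

    One-way analysis of variance between and within ewes, allowing
    unequal numbers of records per ewe. Assumes at least two ewes and
    at least one ewe with more than one record.
    """
    by_ewe = defaultdict(list)
    for ewe, lambs in records:
        by_ewe[ewe].append(lambs)

    groups = list(by_ewe.values())
    a = len(groups)                          # number of ewes
    n_total = sum(len(g) for g in groups)    # total number of records
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (n_total - a)

    # Effective number of records per ewe when numbers are unequal.
    k0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (a - 1)

    # The "additional element" of variance between ewes.
    var_between = (ms_between - ms_within) / k0
    return var_between / (var_between + ms_within)


# Hypothetical records: three ewes, three lambings each.
example = [("a", 0), ("a", 0), ("a", 1),
           ("b", 1), ("b", 1), ("b", 2),
           ("c", 1), ("c", 2), ("c", 2)]
print(repeatability_anova(example))
```

With real data the estimate can come out negative when the between-ewe mean square falls below the within-ewe mean square; the sketch does not guard against that.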

Another method of attack, which is simple and easy to explain, is 
merely to compute the regression of the average number of lambs at 
future lambings on the number of lambs at the first lambing, or at the 
second, or at any other one lambing. This regression is the repeatability 
coefficient you wish. That this is so is shown in the following equation, 
which would be perfectly valid for continuously distributed character- 
istics and is approximately so, even for such coarse grouping as this. 

Let Y = average number of lambs at future lambings 
    X = number of lambs at first lambing (or at some other one lambing) 
    t = repeatability of number of lambs from one lambing to another 
    n = number of future lambings averaged in Y 


\[
b_{YX} = r_{XY}\,\frac{\sigma_Y}{\sigma_X}
       = t\,\sqrt{\frac{n}{1+(n-1)t}}\;\cdot\;
         \frac{\sigma_X\sqrt{\dfrac{1+(n-1)t}{n}}}{\sigma_X}
       = t
\]


The averaging in Y affects r_XY in such a way as to cancel exactly its 
effect on σ_Y. Therefore, n disappears from the regression of Y on X 
(although it would not disappear from the regression of X on Y, and it 
does affect the sampling error of b_YX). For complete accuracy the 
preceding formula requires (1) that σ_X be the same for all lambings, 
whether 1st, 2nd, or nth, and (2) that t be the same between X_1 and X_2 
as it is between X_1 and X_3, X_2 and X_3, etc. Minor variations in σ_X 
or in r_XiXj won’t matter much, although perhaps σ_X1 is enough smaller 
than the other σ_X’s that this should be examined. In working your data 
in this way, you would, for example, merely sort all of the ewes you 
have on the number of lambs at their first lambing. Then you would 
compute the actual average number of lambs at all future lambings for 
those who had no lamb the first time, for those who had one lamb the 
first time, and for those who had two lambs the first time. (If age of 
ewe affects the average number of lambs born, you would need to make 
allowance for that in the averages. I would suppose this unimportantly 
small, except in the difference between first lambing and other lambings). 
The repeatability you wish would be the future difference between the 
zeros and the ones, or the future difference between the ones and the 
twos. This raises at once (and incidentally provides the means for 
answering) the question of whether the difference between zero lambs 
and one lamb is the same sort of thing as the difference between one 
lamb and two. It is readily imaginable that the cases of zero lambs 
might represent a different kind of a phenomenon or, at least, be much 
farther away (or closer) on the scale of fertility from the single lambs 
than the single lambs are from the twins. This you examine by seeing 
whether the difference in Y between the zeros and the singles is the 
same as the difference in Y between those who have singles and those 
who have twins. That question, I think, you have to answer (at least 
to your own satisfaction) fairly early in the study. 
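
The sorting procedure just described can be sketched in Python as follows. This is only an illustration under stated assumptions: each ewe is represented by a hypothetical list of lambs born at successive lambings, future lambings are averaged without any age correction, and the function names are invented for the example.

```python
def future_means_by_first(ewes):
    """Average number of lambs at all future lambings, sorted on first lambing.

    `ewes` is a list of records, each a list of lambs born at successive
    lambings. Ewes with a single record are skipped.
    """
    totals = {}
    for rec in ewes:
        if len(rec) < 2:
            continue
        x = rec[0]
        y = sum(rec[1:]) / (len(rec) - 1)
        s, c = totals.get(x, (0.0, 0))
        totals[x] = (s + y, c + 1)
    return {x: s / c for x, (s, c) in sorted(totals.items())}


def regression_future_on_first(ewes):
    """Regression of Y (mean of future lambings) on X (first lambing)."""
    pairs = [(rec[0], sum(rec[1:]) / (len(rec) - 1)) for rec in ewes if len(rec) >= 2]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


# Hypothetical flock of six ewes with three lambings each.
flock = [[0, 0, 1], [0, 1, 1], [1, 1, 1], [1, 1, 2], [2, 1, 2], [2, 2, 2]]
means = future_means_by_first(flock)
print(means)
print(means[1] - means[0], means[2] - means[1])   # zeros vs. ones, ones vs. twos
print(regression_future_on_first(flock))
```

Comparing the two printed differences is the check on whether the step from zero lambs to one lamb behaves like the step from one lamb to two.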

This would be a study of repeatability of the first lambing, or the 
value of the first lambing as an indicator of all the future lambings. The 
correlations of future lambings with each other are not involved, except 
as they affect the ratio between σ_Y and σ_X. It is readily imaginable 
that the first lambing might in some way be different from later lambings 
so that the later ones might be correlated more or less closely with 
each other than they are with the first lambing. For a complete study, 
you would want to investigate that, too, and combine with this earlier 
estimate whatever information the data contain concerning the 
repeatability of lambings subsequent to the first. You could do this 
by treating each lambing order in turn just as you treated the first one. 

This raises some questions (minor in this case, I am sure, because 
repeatability will be low) about the best way to pool all this information. 
If you go about it by studying in succession, the regression of the future 
on the first, the future on the second, the future on the third, etc., and 
then pooling all these, you have combined all the information from the 
different ewes by counting each ewe k — 1 times where k is the number of 
lambing seasons she was present. This gives a tiny fraction more 
emphasis to ewes with many lambing seasons than they should have, 
although the undue part of this extra weighting of them is extremely 
tiny where the repeatability is as low as this will be. I see no reason 
why a little extra weighting of those with many lambings should 
consistently tend to tilt b upward (or consistently downward). The smaller 
variation at the first lambing might make a difference of one lamb at 
the first lambing mean more than a difference of one lamb at the second 
or third lambing. 
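
One possible way of pooling the successive regressions is sketched below in Python. The answer does not specify a pooling rule, so the choice made here (adding the sums of squares and products computed separately within each lambing order) is an assumption, as are the function name and the list-of-lists data layout. A ewe with k lambings contributes k − 1 pairs, as noted above.

```python
def pooled_regression(ewes):
    """Pool the regressions of 'future on first', 'future on second', etc.

    For each lambing order j, X is the record at lambing j and Y the mean
    of all later records of the same ewe. Sums of squares and products are
    taken about the means for that order and then added over orders.
    """
    max_k = max(len(rec) for rec in ewes)
    sxy_total = 0.0
    sxx_total = 0.0
    for j in range(max_k - 1):
        pairs = [(rec[j], sum(rec[j + 1:]) / len(rec[j + 1:]))
                 for rec in ewes if len(rec) > j + 1]
        if len(pairs) < 2:
            continue  # too few ewes at this order to add any information
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        sxy_total += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        sxx_total += sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy_total / sxx_total


# Hypothetical records for three ewes.
print(pooled_regression([[0, 0, 1], [1, 1, 1], [2, 1, 2]]))
```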

If the size of t is known, or you are willing to postulate it, the 
regression of number at future lambings on average number at the first 
two lambings, or the first m lambings, can be computed by the following 
equation. Let W be the ewe’s average in her first m and Y be the ewe’s 
average in her next n lambings. 


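\[
b_{YW} = r_{WY}\,\frac{\sigma_Y}{\sigma_W}
       = \frac{t\sqrt{mn}}{\sqrt{[1+(m-1)t]\,[1+(n-1)t]}}\;\cdot\;
         \frac{\sigma_X\sqrt{\dfrac{1+(n-1)t}{n}}}{\sigma_X\sqrt{\dfrac{1+(m-1)t}{m}}}
       = \frac{mt}{1+(m-1)t}
\]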


For rigorous exactness this also requires that σ_X1 = σ_X2 = σ_X3 and that 
r_X1X2 = r_X1X3 = r_X2X3 = t, but minor variations from that will have 
scarcely any effect. Only the possibilities that σ_X1 is distinctly smaller 
than the standard deviations at later ages and that zero lamb is physio- 
logically a different kind of phenomenon not on the same scale as singles 
and twins, need concern you here, I think. You can test this formula 
by sorting your ewes on their W values and computing directly the 
regression of Y on W. If the directly observed regression agrees fairly 
well with the formula involving t and m, I would think you could 
proceed with rather high confidence. The direct determination of the 
regression would convince some of your readers who might be mystified 
by the general formula. 
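
The comparison suggested here can be sketched in Python. The sketch assumes that the general formula reduces to mt / [1 + (m − 1)t], which follows when every lambing has the same standard deviation and every pair of lambings the same correlation t; the function names and the list-of-lists layout are hypothetical.

```python
def predicted_slope(t, m):
    """Regression of future average on the average of the first m lambings,
    assuming equal variances and a common correlation t between lambings."""
    return m * t / (1 + (m - 1) * t)


def observed_slope(ewes, m, n):
    """Directly computed regression of Y on W.

    W is the ewe's average in her first m lambings and Y her average in the
    next n. Ewes with fewer than m + n records are left out.
    """
    pairs = [(sum(rec[:m]) / m, sum(rec[m:m + n]) / n)
             for rec in ewes if len(rec) >= m + n]
    count = len(pairs)
    mean_w = sum(w for w, _ in pairs) / count
    mean_y = sum(y for _, y in pairs) / count
    swy = sum((w - mean_w) * (y - mean_y) for w, y in pairs)
    sww = sum((w - mean_w) ** 2 for w, _ in pairs)
    return swy / sww


# Hypothetical use: t taken from the regression of Y on X, then checked
# against the directly observed regression on the first two lambings.
t_from_first = 0.10
print(predicted_slope(t_from_first, m=2))
print(observed_slope([[0, 0, 0, 1], [1, 1, 1, 1], [2, 2, 1, 2]], m=2, n=2))
```

Fair agreement between the two printed values, on real data, is the check described in the paragraph above.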




If the coarseness of grouping introduces only random errors, it makes 
the observed t lower than would be found if the characteristic could be 
measured on a continuous scale. This tendency follows Shewhart’s 
(1926) formula: 

\[
r_{X_o Y_o} = r_{X_t Y_t}\,\frac{\sigma_{X_t}}{\sigma_{X_o}}\,\frac{\sigma_{Y_t}}{\sigma_{Y_o}}
\]

where the t subscript indicates the true value and the o subscript indicates 
the observed value, the latter being the true value, plus or minus a 
random error of observation. Since W won’t be grouped as coarsely as 
X, this probably will make your directly observed regression of Y on W 
slightly higher than the one computed from m and the t value observed 
in the regression of Y on X. Other than taking this into account, I see 
nothing you can actually do about the coarseness of grouping, since 
the number at a birth is necessarily discrete. 
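
The size of this effect can be illustrated with a few lines of Python. The sketch assumes the familiar attenuation relation in which each observed variance is the true variance plus an independent random error variance; the function name and the numbers are hypothetical and are not taken from the answer.

```python
import math


def attenuated_correlation(r_true, var_true_x, var_err_x, var_true_y, var_err_y):
    """Observed correlation when random errors are added to X and Y.

    Each true standard deviation is divided by the corresponding observed
    standard deviation, where observed variance = true + error variance.
    """
    ratio_x = math.sqrt(var_true_x / (var_true_x + var_err_x))
    ratio_y = math.sqrt(var_true_y / (var_true_y + var_err_y))
    return r_true * ratio_x * ratio_y


# Coarse grouping in X only (hypothetical error variance), none in Y.
print(attenuated_correlation(0.5, 1.0, 3.0, 1.0, 0.0))
```

The larger the error variance introduced by the grouping, the further the observed value falls below the true one.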


J. L. 




ABSTRACTS 


Meeting of The Biometric Society, French Region, December 7, 1955 


364 P. CAZAMIAN AND J. L. SOULE. Statistical Demonstration of 
Various Factors Liable to Modify the Results of Spirography 
in Miners. 


Statistical analysis of the results of 1355 spirographic examinations 
carried out in 1951, 1952 and 1953 at the Centre Médical d’Etudes des 
Houillères des Cévennes on miners or former miners, most often in 
connection with expert assessments for silicosis, has brought out 
several interesting facts, among them: 

(1) A real but weak association between the radiological picture and 
the impairment of respiratory function (the smallness of the correlation 
explains quantitatively the contradictions between the conclusions of 
different authors according to the number of cases studied). 

(2) Study of the regression of the respiratory characteristics on age 
revealed statistically significant anomalies, which disappeared when the 
population was split into current miners and former miners. The latter, 
when they are not silicotic, have respiratory characteristics very 
significantly superior to those of the former. It could be established 
that this applies chiefly to miners who had recently left underground 
work (less than one month). 


365 D. BARGETON, P. DEJOURS AND F. GIRARD. Study of the 
Spontaneous Fluctuations of Pulmonary Ventilation at Rest. 


In an intact, awake animal at rest, one observes large spontaneous 
fluctuations of the mean ventilation over one minute, V, and of the 
respiratory frequency f over the same interval. There exists between 
them a linear relation (1) V = a + bf, the correlation coefficient 
reaching or exceeding 0.95. The parameters a and b have stable values, 
of the order of magnitude of the alveolar ventilation and of the dead 
space respectively. 

Relation (1) is interpreted as expressing the operation of chemical 
regulation, and the standard deviation about the regression characterizes 
the sensitivity of that regulation. 

An analogous relation (2) is found within a single respiratory 
movement if its volume V is compared with its duration P: V = b + aP, 
the lag time being less than the duration of one movement. 

This linkage is maintained by proprioceptive control; it disappears 
under narcosis and upon functional exclusion of the vagus nerves. 
Changes in the form of the respiratory movements are subject to a 
stochastic relationship that tends to limit variations in alveolar 
ventilation. 




THE BIOMETRIC SOCIETY 


German Region. A meeting in Giessen, Germany, on July 23, 
under the joint sponsorship of the German Region and the Mathematical 
Institute of the Justus-Liebig-Hochschule, offered the following program: 
Wilhelm Ludwig (Heidelberg), Problem of the optimum in biomathe- 
matics, and Harold Hotelling (University of North Carolina), General- 
ized analysis of variance for two or more dimensions of each individual. 

Japan. At a joint meeting at Kyoto University on October 19 
of the Biometric Society and the Research Association of Statistical 
Sciences, the following papers were presented: T. Seguchi, On the 
estimation of birth and death rates; K. Ito, On a test for the multi- 
variate Behrens-Fisher problem; K. Saito, On sampling on successive 
occasions; M. Ogawara, Stochastic prediction of earthquakes; K. 
Sakai, S. Shiraska and T. Okuno, Statistical analysis of an individual 
competition test on sweet potato; H. Inamura and H. Ohata, Statistical 
method in the breeding of barley; and M. Masuyama, Elementary 
method of construction of orthogonal arrays by IBM-602a and by 
hand-sorted punched cards. 

French Region. At the meeting of the Société Française de 
Biométrie, held on Wednesday, December 7, at the Ecole Normale 
Supérieure in Paris, Messrs. Bargeton, Dejours and Girard discussed 
“Study of the spontaneous fluctuations of pulmonary ventilation at 
rest”, and Dr. P. Cazamian and J. L. Soule “Statistical demonstration 
of various factors liable to modify the results of spirography in 
miners”. 

British Region. The annual meeting of the Region was held at 
the Wellcome Research Institute in London on December 12. The 
following regional officers were elected for 1956: President, D. J. 
Finney; Treasurer, A. R. G. Owen; Secretary, E. C. Fieller; Committee 
members for 1956-58, Sir Ronald Fisher, F. Yates. After the annual 
meeting the following papers were read and discussed: J. G. Skellam, 
A kinetic theory of transects, and Mrs. M. E. Wallace, The use of 
affinity data in chromosome mapping. 

Netherlands. Members of the three biometrical clubs in the 
Netherlands met on December 22 in Utrecht at Hotel Smits, where 
papers were read by C. Postma on principles of multifactorial analysis, 
by G. de Leve on its statistical aspects, and by G. Hamming on re- 
gression and factor analysis. Through lack of funds, it has been 
necessary to discontinue a small periodical in Dutch called ‘Biometric 
Contacts”, which had been published for two years by the combined 
biometrical clubs. 




ENAR. The Region met jointly with the American Statistical 
Association and the Institute of Mathematical Statistics in New 
York City on December 27-29, with a program of seven scientific 
sessions and the annual meeting. At the sessions, 177 members regis- 
tered. The opening session on December 27 concerned Probability 
and Statistics in Genetics, with papers by M. Kimura, Some problems 
of stochastic processes in genetics; H. Levene, Estimation of parameters 
in genetic models; and N. E. Morton, Sequential tests for detection 
of linkage in man. On December 28, the first session, on Bioassay, 
opened with two papers on the Precision of microbial assays for vitamin 
B12, the first by W. Weiss, H. Edelson and H. W. Loy and the second 
by C. I. Bliss. S. R. Ames then reported on A slope-ratio liver-storage 
bioassay for vitamin A, and W. R. Bryan on The assay of Rous sarcoma 
virus by tumor response in chickens. The next session concerned 
Statistical Studies of Accident Proneness and the Contagion of Acci- 
dents, the first being a general review by J. Neyman, the second a 
discussion of asymptotic tests and power of tests by C. H. Kraft, and 
the third a limit theorem on related conditional distributions by G. P. 
Steck. An afternoon session on the Interpretation of Genetic Data 
offered papers by A. Kimball, Approximate confidence intervals for 
specific locus mutation rates; T. W. Horner and C. R. Weber, Theo- 
retical and experimental study of selfed populations; O. Kempthorne, 
Epistacy under selfing; and D. S. Robson, Application of the K4 
statistics to genetic variance component analysis. 

The program on December 29 opened with a session on Subjective 
Testing with the following papers: C. I. Bliss and M. Greenwood, 
A rankit analysis for paired comparisons in taste testing; J. W. Hopkins 
and N. T. Gridgeman, Some stimulus response relations in pair-ranking 
taste experiments; E. F. Murphy, Problems needing answers; and G. E. 
Ferris, Three useful designs in taste testing. A noon program on 
Statistics in Medical Experimentation listed papers by D. Blackwell 
and J. L. Hodges, Elimination of selection bias in medical experimen- 
tation; T. S. Ferguson, Estimation of bacterial densities; and A. Berger, 
On comparing survival rates. A session of contributed papers had the 
following program: H. W. Norton, One likelihood adjustment may be 
inadequate; D. R. Cox, Some general remarks on quick tests of signifi- 
cance; A. E. Sarhan, The teaching of statistics in Egypt; and M. C. 
Sheps and P. L. Munson, The use of chick comb biological assay in the 
study of urinary androgens. At the closing annual meeting, attended 
by 24 members, the Region reelected D. B. Duncan as President and 
A. M. Dutton as Secretary-Treasurer, and named E. J. deBeer and 
W. J. Youden members of the Regional Committee for 1956-58. 




Bowl presented to Miss Cox. A small informal breakfast meeting 
was arranged at the Hotel Biltmore in New York City on December 
29, attended by the following officers of the Society, W. G. Cochran, 
C. I. Bliss, D. B. Duncan, W. J. Youden, J. W. Hopkins, and Gertrude 
M. Cox. On behalf of the Society as a small token of its esteem, President 
Cochran presented Professor Cox with a Revere silver bowl which 
carried the inscription: “Presented to Gertrude M. Cox by the Biometric 
Society in grateful appreciation of her outstanding services as 
the first Editor of BIOMETRICS”. 

General Officers for 1956. The Council of the Society has elected 
the following officers for 1956: President, E. A. Cornish, CSIRO, 
University of Adelaide, Australia; Secretary, M. J. R. Healy, Roth- 
amsted Experimental Station, England; and Treasurer, C. I. Bliss, 
The Connecticut Agricultural Experiment Station and Yale University, 
USA. It is hoped to complete the transfer of the Secretary’s office 
from New Haven to Harpenden during the spring. 

The following were elected to Council for 1956-58: F. J. Anscombe, 
C. Barigozzi, W. G. Cochran, A. Groszmann, L. Martin, C. R. Rao and 
E. J. Williams. 




THE VARENNA SEMINAR IN BIOMETRY 


From a report by L. CAVALLI-SFORZA 


Most universities in continental Europe offer little or no instruction 
in biometric methods for biologists (sensu latissimo). To meet the 
growing demand for such instruction among researchers through a 
brief extra-university course, an International Seminar on Biometric 
Methods was organized by the Italian Region of the Biometric Society 
under the auspices of the IUBS and UNESCO. It was held at Varenna, 
Italy, on Lake Como, on September 7-23, 1955. In view of its novelty, 
at least in Europe, and of the assistance that it may give in planning 
future similar undertakings, the report of the Director of the Seminar, 
Dr. L. L. Cavalli-Sforza, is summarized here for a wider audience. 

Organization. Two principles were followed in organizing the 
Seminar: (1) to conduct the basic courses and as many others as 
possible in the local language, Italian, and (2) to have as international 
a teaching body as possible within this limitation, in order to maintain 
the present international unity in biometrical thought and methodology. 
Contacts with prospective teachers were started a year in advance but 
detailed syllabi were not discussed until February to June, 1955. The 
site chosen, Villa Monastero at Varenna, has in recent years become a 
favorite resort for international summer courses and symposia. It 
offers complete isolation, a beautiful garden, a lecture hall, a smaller 
room for practical exercises, and living accommodations for the teaching 
staff. 

Announcements of the Seminar were sent in March to all university 
departments that might be interested, to governmental and other 
research institutes, to Italian members of the Biometric Society, and 
to European secretaries of the Society. Applications numbered nearly 
100. Although an enrollment of 25 to 30 students was planned originally, 
this number was doubled by having the students work in pairs at the 
calculators. Forty men and 16 women attended, all but one of them 
Italian residents, the other a Swiss. Forty-one held university posts 
and the others research positions. Although applicants with some 
statistical background were favored, the students varied markedly in 
this regard. As indicated by their degrees, 19 had basic training in 
medicine, 18 in the natural or biological sciences, 10 in agriculture, 
and 8 in other fields. 

The Seminar was financed primarily by a grant of $1000 to the 
Biometric Section of IUBS from UNESCO and by student fees (at $15). 
Approximately $300 was appropriated by the Italian Region of the 
Biometric Society. The Villa was provided rent free by Ente Villa 
Monastero (Como Province) and the calculators obtained on loan. 

Program. There were 14 working days in the 17 days of the course, 
each organized on a common pattern. To accommodate the students’ 
varying statistical and mathematical backgrounds, the following four 
general courses with a lecture each morning were offered, although 
most students preferred to attend all courses. (1) Theoretical foun- 
dations, M. P. Geppert (W. G. Kerckhoff Institute, Bad Nauheim). 
Designed for students with some mathematical background, this course 
covered topics such as probability, one and two variable stochastic 
distributions, random sampling for attributes and for measurements, 
and the major tests of significance. (2) Applied statistical methods, 
C. A. B. Smith (University College, London). An introduction with 
no statistical prerequisites, this dealt with the basic statistical procedures, 
including tests of significance, analysis of variance, simple experimental 
design, regression and matrices. (3) Design of sampling surveys and 
experiments, F. J. Anscombe (Cambridge University). For students 
familiar with basic statistical methods, this considered sampling from 
both homogeneous and non-homogeneous populations and its relation 
to experimental designs. (4) Single degrees of freedom, L. L. Cavalli- 
Sforza (Istituto Sieroterapico, Milan). Primarily for beginners, 
this course dealt with individual comparisons in χ² analysis and in 
the analysis of variance. The last two lectures in the fourth course 
were given by G. Pompilj on a short-cut substitute for the analysis of 
covariance. An early morning discussion by Dr. Geppert complemented 
her lectures in course (1); each of the other courses provided a daily 
practical exercise. 

In the afternoon, the students were divided for practical work at 
the calculators into two groups, one for agricultural and science students 
from 2 to 4 p. m. and the other for medical students from 4 to 6 p. m. 
In the time free of practicals, students attended daily special lectures. 
For the science group, these concerned Statistical Genetics by M. 
Siniscalco, Biometrical Genetics by R. Scossiroli and P. Dassat, 
Agricultural Experimentation by P. V. Sukhatme, and Sampling Problems 
by A. Linder and H. Furgag. The medical group attended lectures on 
the application of statistics to Demography and Hygiene by A. Tizzano, 
to Clinical Work by G. Barbensi and A. Linder, and to Bioassay by 
G. A. Maccacaro. In the daily group discussion from 6 to 7:30 p. m., 
R. Scossiroli and G. Maccacaro reviewed the practical exercises for each 
day and their solution. Sir Ronald Fisher gave an additional short 
course of six lectures for a restricted group on “The logic of inductive 
inference”. 



Student evaluation. At the end of the Seminar, students filled in 
unsigned questionnaires designed to obtain information for use in 
organizing future similar seminars. The fifty-five questionnaires which 
were returned may be summarized as follows. 

The students were about equally divided among those receiving 
complete, partial and no financial support from their employers for 
attending the seminar. Twenty considered the number attending 
too large, and the remainder (35) that it was approximately right. 
Most students considered the presence of students from different 
biological fields useful (31) or indifferent (19), only four feeling that this 
was objectionable. There was much less approval, however, for the 
mixture of different statistical and mathematical backgrounds, 26 
considering this objectionable, 19 indifferent and only 9 useful. This 
opinion was not related to the background of the respondent. Con- 
sidering their normal working duties and expenses, 80 percent of the 
students thought that the Seminar was of about the right duration. 

As noted above, the course content was about as concentrated as 
possible, and there was a slight preference (30) for a less concentrated 
course. Nearly all students would have welcomed summaries of the 
lectures. Thirty-three considered the time spent at the calculator 
about right, 17 as too short, and only 4 as too long. Twice as many 
found working at the calculators in pairs useful as found it troublesome. 
Guided exercises showing all the steps were preferred by 80 percent 
of the students to those giving very few of the steps, although some 
students preferred a gradual shift from the first to the second type. 
More students preferred to have only the final results of the practicals 
given on the exercise sheet (26), as compared with no answers (15) or 
both intermediate and final results (14). During the practicals, one 
or two students with a better-than-average background knowledge 
worked the problems a day in advance and assisted the other students. 
Eighty percent considered this help sufficient, those who found it 
insufficient having less background knowledge than the others. Although 
a 10-page glossary of definitions and formulas was distributed at the 
beginning, most students never referred to it. There was general 
agreement that organized discussions should have more time. 

Four questions dealt with content. Of the 16 requests for additional 
topics, six were considered as well-grounded; and of the six statements 
listing subjects deemed useless, most referred to details. Of the special 
courses, that on bioassay was the most popular; those given in English 
or French had little attendance. Most suggestions for additional special 
courses were too specialized to be feasible. 

In lieu of examinations, which proved too unpopular to attempt, 
students were asked to score each practical exercise on a scale from 
0 to 2 for their knowledge of the underlying statistical method at the 
start and at the end of the course and its potential usefulness. Taken 
at face value, the mean gain in the score suggested that the average 
student almost tripled his initial knowledge. 

Conclusions. Despite the common complaint that “there was 
no time for sedimentation” and the importance for mathematically 
untrained biologists of a slow process of “digestion”, the present 
experience has convinced the author that short concentrated courses 
in biometry can be useful. There is currently a real hunger for such 
courses and the demand is likely to increase until more universities 
have established regular courses in biometry with sufficient practical 
work. Apart from instruction in the local language, an international 
teaching body and a pleasant but isolated locale, the enthusiasm of 
the people attending the Seminar from the beginning to the very end 
contributed most to its success. For future similar programs, the 
following suggestions seem pertinent. 

(1) The amount of work should be decreased to eight hours a day 
for three weeks or less, but the general discussions allowed more time. 
This would mean dropping most of the special lectures, but incorporat- 
ing bioassay in one of the general courses. The time spent on theoretical 
foundations might be reduced, although students with an interest in 
learning why and not just how should be encouraged. 

(2) More stratification of the background knowledge of the student 
should be useful, although few biologists in Italy are prepared for 
courses at a higher level. It is not easy to assess the initial knowledge 
of applicants and those who start from scratch may profit the most 
with elementary teaching. 

(3) Discussion groups should be smaller, although this increases 
the number of teachers. If conducted by others than the main lecturers, 
a given problem is more likely to be looked at from different angles. 

Other suggestions will be apparent from the responses to the 
questionnaires. In general, the project was considered very much 
worthwhile and we trust that it can be continued in coming years. 




NEWS AND NOTES 


Professor G. W. Snedecor is currently on assignment in Brazil as 
Consultant in Experimental Statistics, to assist in the design and 
analysis of agricultural experiments and to help complete arrangements 
for a Research and Training Center of Statistics in the State of Sao 
Paulo. The project is under the auspices of the Institute of Statistics, 
University of North Carolina, with financial assistance from the Rocke- 
feller Foundation. His headquarters will be in Campinas for five months 
beginning January 9, 1956. 


Summer Sessions at Berkeley, California. 


The 1956 summer program in the Department of Statistics of the 
University of California, Berkeley, California, will consist of two 
sessions: June 18 to July 28, and July 30 to September 8. The faculty 
of the summer sessions will include Professor D. R. Cox of the University 
of North Carolina, Professor Grace E. Bates of Mount Holyoke College, 
and Professor David Blackwell and Mr. T. S. Ferguson of the 
Department of Statistics of the University of California. 

The program includes two of the usual undergraduate courses in 
each session, adapted primarily to meet the needs of students trans- 
ferring from other centers who would like to undertake advanced study 
at the University of California during the regular academic year. Also 
a graduate seminar will be conducted by Professor Blackwell. This 
seminar will allow for individual consultation for students working 
toward higher degrees. 


Southern Regional Graduate Summer Session in Statistics. 


A continuing integrated program of graduate summer sessions in 
statistics, to be held in rotation at Virginia Polytechnic Institute, 
Florida University and North Carolina State College was begun in 1954. 
This year’s session will be held at North Carolina State College, Raleigh, 
June 11 through July 20. Eleven courses will be offered in advanced 
calculus, elementary and advanced statistical methods, theory and 
analysis, stochastic processes, sample survey designs, econometric 
methods, linear programming and special problems. Seminars will be 
held twice weekly, and there will also be special Social Science Research 
Council lectures on linear equations and on production functions. A 
maximum of two courses may be taken for residence credit towards a 
graduate degree at any of the cooperating institutions, as well as at 
some other universities. 




Summer session faculty will comprise D. B. Duncan, Statistical 
Laboratory, University of Florida; W. L. Smith, Dept. of Statistics, 
University of North Carolina; and C. Harrell, Dept. of Economics, J. 
Levine, Dept. of Mathematics, and R. L. Anderson, Gertrude M. Cox, 
A. L. Finkner, A. H. E. Grandage, R. J. Hader and R. J. Monroe, Dept. 
of Experimental Statistics, North Carolina State College. 




